We help investigative reporters across the globe to follow the money using secure technology and data alchemy.

Journalism’s Deep Web: 7 Tips on Using OCCRP Data

This article was first published on GIJN.org.

The Organized Crime and Corruption Reporting Project (OCCRP) Data Team has developed new features on OCCRP Data in the past six months and brought together more than 200 different datasets. Its new software is now configured to let reporters search all of those at once.

OCCRP Data, part of the Investigative Dashboard, offers journalists a shortcut to the deep web. It now has over 170 public sources and more than 100 million leads for public search – news archives, court documents, leaks and grey literature encompassing UK parliamentary inquiries, companies and procurement databases, NGO reports and even CIA rendition flights, among other choice reading. (All this is publicly available. If you’re associated with OCCRP, you’ll have access to more than 250 million items).

Uniquely, the database also contains international sanctions lists detailing persons of political or criminal relevance.

The new platform makes searching diverse types of objects, such as emails, documents and database entries from corporate or land registries into a unified user experience, with an appropriate way to display each type of data.

Here are seven tips to help you get the most out of OCCRP Data:

Browse Directly on Your Screen

OCCRP Data has emails, PDF and Word documents, contracts, old news archives, even Rudyard Kipling poems (from Wikileaks, to be fair). Its brand new interface makes it easier for you to view documents, search within them and preview them in the browser without having to download and open them, making research a faster and more seamless process.

New Search Filter Options

OCCRP Data lets you filter search results by sources, document type, as well as emails, phone numbers, addresses, entity names, countries and more on its left-hand column, after you’ve run your search.

Highlight Connections

You can explore structured data in new ways because OCCRP Data uses entity extraction on documents and emails to find phone numbers, names of people and companies, addresses, ID numbers and other key linkage details of interest. Just click on an entity and see the “Tags” option in the preview screen.

Do Bulk Comparisons

OCCRP Data is capable of cross-referencing the information on two lists; it also ranks data that closely matches and lets you compare the information. Click on a source and then click on the “Cross Reference” option to choose another source with which to do the comparison.

Monitor Search Terms, Receive Alerts

OCCRP Data now has an alerts feature that allows you to monitor a search term so when new information is added to the database you will receive a notification. Simply switch on the bell icon right next to your search query.

Language Support

OCCRP Data now supports multiple languages. The interface is translated and supports Russian and Bosnian-Serbo-Croatian. Search results on the database can also be filtered by language. The data team is working on adding other languages, such as German and Spanish.

Advanced Search Operators

You can use complex search operators to do things such as proximity searches, exact term searches, take into account spelling errors and combine queries.

Any Questions?

Anyone accessing OCCRP Data can check out the the Aleph Wiki where the data team covers its uses, function and development roadmap. Journalists and technologists alike can read the user manual or contact data@occrp.org to give us feedback.