Years of unstructured news data synthesized
Reduced manual effort
Billion news articles processed
Dow Jones has a 30+ year archive of premium news articles that continues to grow by an estimated 1 million incoming articles each day. The organization wanted to provide scalable, flexible access to its 1.3-billion-document premium news repository, among the world’s largest, through its new cloud-based content processing and storage platform, Dow Jones DNA.
To showcase the depth and breadth of the DNA dataset, Dow Jones wanted a solution that could process large volumes of historical and streaming business news documents and surface hidden insights by transforming text into named entities (such as people, locations, money and events) and the relationships among them. Dow Jones found that these articles could serve as data points that inform evolving industry demand in portfolio management, sales, business development, risk target identification and aggregation of deal opportunities, among others.
Quantiphi processed Dow Jones's terabyte-scale, unstructured data corpus and developed a Knowledge Graph framework that helps data scientists and developers discover insights into the network effects and business impacts of rare global events, such as a major natural disaster. Customers can also visualize other key events, hidden relationships, or unseen opportunities that could affect their business. The tool uses Google Cloud Platform, the Dow Jones DNA (Data, News & Analytics) service, TensorFlow, and a graph database platform to perform text mining, machine learning, data integration, and enterprise advanced analytics.
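To make the idea of turning text into entities and relationships more concrete, here is a minimal Python sketch. It is not Quantiphi's implementation. It assumes a tiny hand-written lookup table in place of a trained entity model, and treats two entities as related whenever they appear in the same article. The company names, person name, sample articles, and function names are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical gazetteer (surface form -> entity type). A production system
# would use a trained named-entity model rather than a fixed lookup table.
GAZETTEER = {
    "Acme Corp": "ORG",
    "Globex": "ORG",
    "Jane Doe": "PERSON",
    "Houston": "LOCATION",
}


def extract_entities(text):
    """Return the sorted gazetteer entries that appear verbatim in the text."""
    return sorted(name for name in GAZETTEER if name in text)


def build_graph(articles):
    """Count how often each pair of entities co-occurs in the same article."""
    edges = Counter()
    for text in articles:
        # Sorted input keeps each pair in a stable (a, b) order.
        for a, b in combinations(extract_entities(text), 2):
            edges[(a, b)] += 1
    return edges


def neighbors(edges, entity):
    """Map every entity linked to `entity` to its co-occurrence count."""
    result = {}
    for (a, b), weight in edges.items():
        if a == entity:
            result[b] = weight
        elif b == entity:
            result[a] = weight
    return result


# Made-up sample articles, for illustration only.
ARTICLES = [
    "Acme Corp halted shipments out of Houston after the storm.",
    "Jane Doe, chief executive of Acme Corp, met Globex in Houston.",
    "Globex reported record quarterly revenue.",
]

GRAPH = build_graph(ARTICLES)

if __name__ == "__main__":
    # Which entities are connected to the location hit by the event?
    print(neighbors(GRAPH, "Houston"))
```

Calling `neighbors(GRAPH, "Houston")` lists the entities that share an article with the affected location, which is the simplest form of the network-effect question described above. Co-occurrence is only a rough stand-in for a real relationship; a fuller system would classify the type of relationship and store the edges in a graph database.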