Ali Ghodsi's startup can read our future from data

Ali Ghodsi promises three advantages when companies use his software: “More Sales, Lower Costs, Less Risk.” It sounds as if he has found the Philosopher’s Stone. In fact, the apps developed by Databricks are all about searching and finding. The work is done by algorithms that improve themselves with the help of artificial intelligence, a technique known as machine learning.

The combination of cutting-edge, complex technology and a classic, easy-to-understand business benefit makes Databricks very popular among investors today. In February, Ghodsi's startup raised $1 billion in a funding round from investors including BlackRock, Salesforce Ventures and Andreessen Horowitz. With a total valuation of $28 billion, Databricks currently ranks 9th on the global list of most valuable startups. According to the company, it has 5,000 customers worldwide, and more than 40 percent of the Fortune 500 use its services.

Genes, shoes and oil rigs: Data templates analyze everything

What makes Databricks so desirable? The software company uses AI-driven algorithms to comb through massive amounts of data and identify patterns: relationships no one has yet discovered. Pharmaceutical companies, for example, can analyze the DNA of millions of people and compare the genetic mutations that cause disease risks. Databricks customer Regeneron, for instance, identified the gene that causes chronic liver disease and developed a drug based on this knowledge.

Shopping sites can group their products individually and place them so that each customer instantly sees everything that really matters to them, and buys more. Zalando and the fashion chain Hennes & Mauritz are on Databricks' client list.

Treasure in the data lake

So is hotels.com, which uses real-time data analysis to show every visitor to its site exactly the hotel it thinks is most likely to convince that potential customer to book. Royal Dutch Shell's huge worldwide fleet of machinery is also under constant AI monitoring: here the algorithms raise the alarm when defects are about to occur, that is, before something happens.

Behind these predictions, recommendations and warnings are simple statistical relationships hidden in unimaginably large databases. The large-scale collection of this data began about 15 years ago, as hard drives became larger and cheaper. At that time, data centers were springing up everywhere. Everything that accumulated was stuffed into storage, regardless of whether it was still needed or not. Experts call these stores data lakes. Even medium-sized companies now store petabytes of data.

For a long time, no one knew what to do with the data lakes, but neither did anyone want to drain them, since there could be some value in them. The assumption was correct: companies like Databricks are now mining these treasures.

Billions and billions of data points

In principle, they filter data lakes, extract the relevant information and present it in easy-to-read overviews. This sounds simple, but it isn’t. He could lay out the basic principles of such analyses on a piece of paper, Ghodsi says in an interview with FOCUS Online. But the real challenge is the huge amount of data. “You are comparing billions and billions of data points, all in real time. Ten years ago that would have been impossible.” At that time the computing power of servers in data centers was not sufficient, and there was no software on the market that could perform such analyses on such a scale.

So Databricks develops such software. It is available for the popular cloud platforms, which now also provide the necessary computing power: Amazon Web Services and Microsoft Azure, with availability on Google Cloud announced a few weeks ago. In very simplified terms, Databricks' algorithms look for patterns in large volumes of data. They learn to do this themselves using test data, which they analyze over and over in countless training rounds.

Joining forces: data processing and artificial intelligence

After each round, the algorithms write their results into models, which the developers then improve. “To do this, we combine data processing with artificial intelligence,” Ghodsi explains. “Our developers are working on both to push the models to perfection.” At the end of this procedure is a model that has been trained to perform a particular task, for example to predict events. It is then unleashed on real data and continuously developed further.
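
To make the idea of training rounds more concrete, here is a toy sketch in Python. It is not Databricks' software and says nothing about how the company's algorithms work internally: the function names `train` and `predict`, the sample data and the choice of a simple straight-line model are all assumptions made purely for illustration. The sketch adjusts a model a little in every round and then applies the finished model to a value it has never seen.

```python
# Toy illustration only: a straight-line model y = w * x + b that is
# improved step by step over many training rounds (gradient descent).
# All names and numbers here are hypothetical.

def train(samples, rounds=5000, learning_rate=0.01):
    """Fit w and b to (x, y) samples by repeated small corrections."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(rounds):
        # Average gradient of the squared prediction error over all samples.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        # Nudge the model in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(model, x):
    """Apply the trained model to a new input."""
    w, b = model
    return w * x + b


# Made-up training data that follows the pattern y = 2x + 1.
training_data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

model = train(training_data)
print(predict(model, 10.0))  # prediction for an input outside the training data
```

Real systems use far richer models and vastly more data, but the basic loop of predicting, measuring the error and correcting the model is the same idea on a small scale.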

AI-driven data analysis is the “next big thing.” Databricks promises nothing less than solving humanity’s biggest problems with its technologies. Power supply, crime fighting, epidemic control: Databricks algorithms are involved in developing coronavirus vaccines, can recognize when 40-year-olds pretend to be teenagers on forums, and discover vulnerabilities in power grids that no one else has found.

The global volume of data is increasing by about 27 percent each year, according to estimates made by the International Data Corporation about two years ago. Starting at about 33 zettabytes in 2018, we will break the 50 zettabyte barrier this year. By 2025, the total volume of data will increase to 175 zettabytes. One zettabyte is one million petabytes; one petabyte corresponds to 1,000 SSDs, each with a capacity of one terabyte. These numbers are hard to imagine, but they bode well for Databricks: “Our algorithms are designed to find needles in a haystack,” says Ali Ghodsi. “The more hay, the better.”
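
The growth figures quoted above can be checked with a few lines of arithmetic. The following Python sketch assumes simple compound growth at a constant annual rate; the function names `project_volume` and `implied_growth` are made up for this illustration and do not come from IDC or Databricks.

```python
# Rough plausibility check of the quoted figures, assuming the data
# volume grows by the same percentage every year (compound growth).

def project_volume(start_zb, annual_growth, years):
    """Volume in zettabytes after `years` of constant annual growth."""
    return start_zb * (1 + annual_growth) ** years


def implied_growth(start_zb, end_zb, years):
    """Constant annual growth rate that turns start_zb into end_zb."""
    return (end_zb / start_zb) ** (1 / years) - 1


# Growth rate implied by 33 ZB in 2018 and 175 ZB in 2025.
rate = implied_growth(33, 175, 2025 - 2018)
print(f"implied annual growth: {rate:.1%}")

# Projection from 33 ZB in 2018 at 27 percent per year.
for year in range(2018, 2026):
    print(year, round(project_volume(33, 0.27, year - 2018), 1))
```

Under this assumption, the 2018 and 2025 figures imply an annual growth rate of just under 27 percent, which is consistent with the estimate quoted in the text.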