The Data Science Renaissance
“If people knew how hard I worked to get my mastery, it wouldn’t seem so wonderful at all.” – Michelangelo
Have you ever thought about what the Renaissance has in common with today’s irresistible digital trend? In the following passage, the author argues that data science is to the modern era what the Renaissance was to the ancient world, and offers a distinct perspective on the comparison. Ultimately, it is about a new way of thinking. Five essential elements are listed as the pre-existing conditions for a successful Renaissance in an era flooded with information. Several potential use cases are also presented.
It takes about 8 minutes to read.
Renaissance means rebirth. A variety of factors, coming together at the same time, can spark a rebirth. In the analytics world, we are facing a confluence of factors: economic disruption, a great re-skilling, and unprecedented access to data. The combination of these factors is sparking the rebirth of data science, with the expert-led model a relic of the past. History is a great teacher, and demonstrates that this Renaissance is not all that dissimilar from the original.
The Renaissance was a development that took place in Italy beginning in the 1400’s. Renaissance artists broke away from the norm (simple styles) and inspired a new era of expressionism. Ultimately, the Renaissance was about a new way of thinking, which sparked a period of extended innovation in the arts.
In the late 1300’s, Florence emerged as an affluent city, with the wealthy using their riches to hire local artisans. As typically happens, this movement led to competition, which in turn stoked creativity. This continued in the 1400’s when the Medici family rose to power in Florence, and used their money and influence to continue the movement. In the 15th century, the Renaissance spread rapidly from its birthplace in Florence to the rest of Italy and then to the rest of Europe.
While there were numerous artistic masterpieces created in this period, the one perhaps most associated with The Renaissance is the sketch of the Vitruvian Man. Combining a circle and a square, with man in the middle, the piece was symbolic of the combination of two things: the heavenly and the earthly. The idea was first postulated by the writer Vitruvius, giving the figure its name. But it was da Vinci who was credited with first illustrating the idea in an anatomically correct way. It is a reminder that those with the original idea are not always the ones who make their mark on history.
The modern day Renaissance in data science is also about a new way of thinking, and has many parallels with The Renaissance of many years ago in Italy:
1) It is economically driven, as the economics of compute, storage, and data have enabled the funding of a new artisan enlightenment.
2) The new artisans can be anyone, not just the wealthy few or those trained in a certain discipline. The expert model of data science is ending.
3) The application and confluence of data and science are being combined into a modern day vision of the future: continuous intelligence through the application of machine learning and deep learning.
Neither the Renaissance in Italy nor the one in data science would exist without a certain set of pre-existing conditions. In both cases, the market conditions enabled the creativity and served as a launching point for future innovation. In the case of the artisans of Florence, as they began to understand science and its application, new technology could develop out of their imagination (think of Leonardo’s early helicopter drawings). Similarly, today’s data science Renaissance is determining the winners and losers in each industry, and those that adapt will survive, driven by application of a new technology.
Today, organizations aiming to harness data science instinctually know what they need to do; they just do not have a prescriptive roadmap that leads to success and a leadership position. Most companies leap to model building and algorithm selection. For some companies, this is the right place to start. But for others, it may be a step too far.
Charlie Munger tells a story about a plane flying over the Mediterranean Sea, making its way towards an exotic location. The pilot’s voice comes on the intercom and says, “A terrible thing just happened, we’re going to have to make a water landing. The plane will stay afloat just long enough to open the door and let everybody out. We have to do it in an orderly fashion. Everybody who can swim, go to the right wing and just stand there, and everybody who can’t swim, go to the left wing and just stand there.”
The pilot continues, “Those of you on the right wing, you’ll find a little island just two miles off. When the plane goes under, just swim to the island and you’ll be fine. For those of you on the left wing, we’d like to thank you for flying with us today.”
Most organizations feel like they have been abandoned out on that left wing of the plane. No guidelines, no assistance. Just an obvious set of challenges. Machine learning problems are data problems. Data science will fundamentally change, automate, and optimize all industries. But, it starts with the basics: the essential elements of data and analytics.
A data strategy is an enabler of data science, because all data is dirty before you feed it into a model. The 5 essential elements of data and analytics create the proper pre-existing conditions for the Renaissance in data science.
The 5 essential elements are:
# Open source is a key enabler of a comprehensive analytics strategy. Openness ensures innovation and speed, while linking to innovation sitting on top of the open platforms. Open source is an eternal community of innovation.
# Unified governance is necessary for insight and compliance. Unified governance does for data what libraries have done for books. Organize, catalog, mask, protect, archive, and make any asset instantly findable. A data library provides insight, but also compliance with key regulations (such as GDPR).
# Hybrid data management prepares an organization for a multi-cloud world. It aligns on-premise and private cloud data investments with public cloud deployments. Whether the focus is on unstructured data or structured data, the future of data management is both private/public, with seamless integration between the two.
# Visualization is about data discovery. Understand data assets, render them in the form the user expects, and enable the data to be manipulated and explored. This is dynamic and real-time, not static.
# Machine learning and data science are ingredients across all of the essential elements of analytics. This is the source of ‘aha’ moments, as an organization enhances and automates decision making and operations. Build, deploy, and train models. Continuously learn as new data comes in. Machine learning and data science must be resident where the data resides for maximum impact.
The starting point is different for every individual, department, and organization. But, the 5 essential elements are consistent. They are the pre-existing condition for a successful Renaissance. While many organizations have done something in each of these areas, most of that was done during and for the prior era.
We are entering a new era of simplicity. Analytics and data science approaches must be simple: installed and running in 15 minutes or less. With the previous era of long, expensive projects, the IT department was the scapegoat. In this new era, IT is no longer the victim of business transformation. Instead, IT, like the Medici in 1400’s Italy, leads and enables this Renaissance.
Data science is coming into form, with machine learning use cases leading the way. Companies are starting to win with machine learning and there are repeatable patterns to drive outcomes. Take, for example, a pharmaceutical company accustomed to an industry where new drugs take 12–14 years to make it to market, with an average cost of $2.6 billion. In this case, data science and machine learning were applied to reduce the cost by 70%. The algorithm was trained on two distinct datasets, one on the toxicity of various chemicals and the other on known side effects from approved medicines. From both datasets, the algorithm was able to predict the toxicity of the medicine with reasonable accuracy.
Many use cases have emerged. Here are the Top 10 I see today:
The only constant here will be change. I expect the Top 10 will evolve every 6–12 months.
As mentioned previously, while da Vinci was the first to correctly draw the Vitruvian Man, history shows that he did not come up with this idea on his own. The secret was to make the geometric shapes off-center, and the credit for that goes to Giacomo Andrea da Ferrara. Giacomo Andrea’s version was riddled with iterations, eventually leading to success. He and da Vinci were colleagues, shared meals, and were seen together. Yet, history only remembers da Vinci’s version.
The difference was a bias for action. da Vinci took action, while others just iterated and stayed in experimentation mode. A Renaissance is a call to action, not a call to reflection. The time is now for data science.
By Rob Thomas, author of The End of Tech Companies and Big Data Revolution.