Although data science as a job function is relatively new compared to roles like software engineer or database administrator, in the age of “Big Data”, more and more companies are building data departments. The data scientist role is typically created when a company reaches a good size, builds a respectable data reservoir and wants to glean insights from this data. As a result, being a data scientist at a tech start up, before it receives funding, is uncommon and I wanted to write a post about my experience in just such a company.
In this post I will recall my history briefly and then talk about the key experiences I had working in a start up environment and the lessons I learned along the way – this is not a technical post so I won’t be going into detail about what I did or how I did it. The aim of this post is to give data scientists some food for thought if they’re considering taking a chance on a smaller company.
Short History
I worked for a London based tech start up, having previously worked as a data analyst for a multinational company. I took the chance on this company because I wanted to make a full transition from data analyst to data scientist (there is a difference) and consequently joined as a junior data scientist. I stayed there for a little under two years, during which the company obtained funding and moved to a bigger office, and my team grew from just my manager (a data engineer by title but able to do pretty much everything) and me to 4 people full time. I got promoted to Data Scientist in about a year and was embedded in the team for the flagship product of the company.
I did a bit of everything – from data visualisation and statistical modelling to helping migrate our data warehouse from PostgreSQL to BigQuery. I was involved in a lot of great projects and we as a data team made an important contribution to the company. After it was over, I reflected on what I had learned through my tenure and found that I had been completely transformed as a data person – I went from academic statistician to an applied data scientist and finally feel comfortable in my professional skin. In the next section I explore the main lessons I learned along the way.
Key Lessons Learned
You will feel useless at first – it’s normal
I joined as the first official data scientist and discovered that there was no defined path or workload for me in the first few months. I recall speaking with my manager who told me his vision for the nascent function and what he hoped to accomplish in the coming months – I was inspired but we had no idea how to get there.
In the very early days I spent my time getting to grips with our data warehouse, the nature of the data we collected and its limitations. Output was minimal but I had the freedom to explore and interrogate the data we had in any way I wanted with the hope that we might better understand our customers and product.
This was unsettling for me initially and it can be summarised by the phrase “suffocation by freedom” – I had no discernible output and had the distinct feeling that people thought I was coasting, especially compared to the software engineers who were churning out product features and improvements by the week. It wasn’t until later that I realised that this exploratory period was necessary for the meaningful work that came afterwards. As data scientists, our work is based on understanding the business uses for data and we show value by producing insights that drive decision making for the product – this is a type of work which is distinct from software engineering and output for us should not always be measured in the number of commits to GitHub.
You can make a Huge Impact
The best part of working for a smaller company is the opportunity to drive your own projects and have a significant impact in your work. When I finally became confident in my role I was working on projects that added value to the company and the type of skill this involved was not found elsewhere in the company. When you join a place where data usage is not a part of the culture, it is ripe for basic analysis projects which land big as quick wins. By the time I got around to building models (and deploying them as apps), the work garnered significant backing within the company and gave our team the visibility we wanted – I also came across as a magical fortune teller with spooky future predicting powers. I was a Prophet, but in the Facebook library sense, not the religious sense.
Moreover, since there is not a set way of doing data science, you have the freedom to implement ideas you think are worth pursuing. The obvious downside is that you can pursue foolish ideas without reproach, but this is a tradeoff I was willing to accept. Fortunately for me, my manager knew a lot about data science and acted as a formidable filter. I researched ideas that I don’t think I would have if I was at a more established company. I tried (and failed) at implementing Markov Chains, I learned about survival analysis and found a use for it, I even looked into social network analysis and made beautiful graphs – whether it was time series or machine learning or stochastic systems I was given the freedom to look into it and find a use case. I learned so much about a variety of statistical topics and this led to a huge jump in my understanding of data science and what can be done.
Data Engineering is a part of your job
You absolutely have to know how your data warehouse and schemas are designed. You will never get clean, curated datasets for analysis and this is especially true for a start up that for a long time was dumping its data events without a thought for how they might be used in the future. As a result, you have to become an expert data wrangler both in SQL and Python/R.
I benefitted enormously from having a data engineer as my manager as I learned about real time data, various database optimisation methods and how data goes from raw product events to a tabulated form which can be interrogated. It is my belief that the line between data engineering and data science is becoming increasingly blurred and although we don’t need to be experts in this, it will only help to learn these concepts – don’t take your data pipeline for granted.
Garbage in Garbage out
For the first year our data, both in terms of volume and number of features, was not enough to perform Machine Learning (ML). Since data scientists love to boast about how many XGBoosts and SVMs they fit, this was a small regret for me. This forced me to be creative and think about using statistical methods from which we can make inferences, generate confidence intervals and use data efficiently.
Forget “Big Data”, small data analysis is where it’s at and you have to develop a good statistical intuition to get the most out of your data. On the occasions we did fit fancy algorithms to our early stage data, we obtained nothing but meaningless junk. Machine Learning is not a substitute for careful planning, understanding context and clear thinking. When we finally obtained the data volumes suitable for machine learning we understood our data well enough to use the methods responsibly. If you think you can blindly throw algorithms at a data problem – think again!
Don’t Silo Data Science
As our team grew, I became a product embedded data scientist. I worked alongside engineers, product managers and technical leads who were all maniacal about the product and constrained their activity based on the good of the product. This was eye opening for me because working day in and day out with engineers and product managers forced me to think about my work in a more applied, product centric way. It also had the auxiliary benefit of showing the value of data science to non-data people in the business. I believe that having product embedded data scientists working with other disciplines is necessary for creating a data driven culture within a team.
What I Liked
The previous section talked about the key lessons I learned working at a start up. I want to talk briefly about what I liked:
- Opportunity to make a big impact – Mentioned in the last section, I felt I really added value to the business and was able to see what that was.
- No set way of doing things – This gives one the freedom and independence to try different things professionally.
- Room for quick wins and the ability to set up a pattern of data work that comes from you vs someone else.
- Work gets implemented fast – It isn’t until you work in a big company that you realise how much of a blessing this is. I could start a piece of work on Monday and by Friday have it be used and acted upon by the company.
What I Disliked
- Not having a data driven culture – When you come into a business that wants to underpin its decisions with data it must be understood that this is a habit and not a one time event. At times I felt our team took one step forward and two steps back when it came to having our work considered seriously by leadership. It takes buy in at all levels to be data driven, not just lip service.
- Having to constantly justify your existence – This stems from the above point: since a data function is new, it has to constantly deliver value in order to justify its place in the company. Teams like Marketing and Sales don’t have to do this as much. I don’t want this to be misinterpreted as me saying that people don’t need to provide a reason why they should be at a company, but the expectations for how we deliver value were not as clear as they were for Marketing and Sales. With a lack of clear expectation of results from the top down, it’s hard knowing how you’re expected to add value. This may not necessarily be a small company problem.
- A small team can be limiting – Having a large data team consisting of more senior members with different specialities would have provided an efficient way of learning new approaches to data problems. There were definitely some cases where I wished I had someone more experienced in statistics at the office to ask for advice. Consequently, I had to follow the wider data science community by going to conferences and workshops in order to meet with specialists in other companies and discuss my challenges with them. This practice still pays dividends today – you don’t get better unless you spend time with people better than you.
Would I do it again?
Overall, working at a start up was the best working experience I’ve had. It was challenging, fascinating, weird, frustrating, painful, satisfying and fun all at once and I loved it. It’s a trial by fire which is necessary for growth. My journey is unique and there are many idiosyncrasies involved but I think you would find a congruent experience for data science positions at similar sized companies.
If you’re a data scientist contemplating working for a smaller company, I would advise you to see who you will be working with daily – these people are the ones who will make or break you and if you have a good feeling from them then it is well worth the risk.
Would I go back at some point to working for a data team at a start up or scale up? Without a doubt. If you face a similar choice in your life then you might just be in for the most fun time of your career!
I had a similar experience before joining IBM and I agree 100% with what you said. This feeling of making one step forward and immediately two steps back because of a 5 minute meeting with management... That was terrible. But the result of a struggle is somehow always positive. I have also learned so many things completely outside the scope of my job description, from writing optimised DB queries to deploying applications all the way through the different staging environments. So yeah, good experience after all I guess.