How to Automate a Big Data ETL Project if You’re an Insurance Major. Part II

The secret sauce of our Big Data success is our Business System Analysis unit.


The project I'm going to tell you about today started with one ETL process. Now it is a full-fledged insurance data platform with five ETL processes serving six business lines with clean, well-structured, verified data for client insights.

There are 23 people on the project, four of whom are Business Analysts (BAs) – guardians of order and documentation. After all, it is not the amount of data that affects the result but its quality, and whether you have taken all the factors into account to see the real picture.

Hi, I’m Alesia Sumkina, Lead BA at Symfa, and today I’m going to tell you how we keep the project for our global insurance client in order so that they get the most out of their data. Read the first part of the story here – our Delivery Manager Andrey Zhilitsky talks about the beginning and evolution of the project, and the growth of the team.

Table of Contents

  • The one-minute-long story of our project (for those who skipped the first part)
  • How I keep an eye on the project to provide clean data for the client
  • How do we make sure no issue is lost?
  • How do we stay on the same page as to who’s responsible for what?
  • Why not use Jira instead?
  • Why is a BA a must-have for Big Data projects?
  • What’s the value that BAs generate for the client on this Big Data project?
  • Last but not least

The one-minute-long story of our project (for those who skipped the first part)

The project started with a simple problem – source data comes in a variety of formats from dozens of agents, and in different languages. The system that previously was used to manually handle those data was outdated and no longer safe to use. Data needs to be processed so that the company understands whether they are in profit or loss. Thus, the data should be converted to two formats – premiums (how many insurance policies were sold) and claims (how much money the company paid against those policies).
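To make the conversion step more tangible, here is a minimal Python sketch of the idea. It is an illustration only, not the code running on the platform: the column aliases, the kind labels and the `normalize` and `split` helpers are all hypothetical and invented for this example.

```python
from dataclasses import dataclass

# Hypothetical aliases: each agent may label the same field differently.
FIELD_ALIASES = {
    "policy_id": {"policy_id", "PolicyNo", "policy_number"},
    "amount": {"amount", "Amount", "sum"},
    "kind": {"kind", "Type", "record_type"},
}

# Hypothetical mapping of source labels onto the two target formats.
KIND_MAP = {"premium": "premium", "PREM": "premium", "claim": "claim", "CLM": "claim"}


@dataclass(frozen=True)
class Record:
    policy_id: str
    kind: str  # "premium" or "claim"
    amount: float


def normalize(raw: dict) -> Record:
    """Map one raw agent row onto the unified record, or raise ValueError."""
    unified = {}
    for target, aliases in FIELD_ALIASES.items():
        found = [raw[name] for name in aliases if name in raw]
        if len(found) != 1:
            raise ValueError(f"cannot resolve field {target!r} in {raw!r}")
        unified[target] = found[0]
    kind = KIND_MAP.get(str(unified["kind"]))
    if kind is None:
        raise ValueError(f"unknown record kind {unified['kind']!r}")
    return Record(str(unified["policy_id"]), kind, float(unified["amount"]))


def split(rows):
    """Return (premiums, claims, rejected) so that no row is silently dropped."""
    premiums, claims, rejected = [], [], []
    for raw in rows:
        try:
            rec = normalize(raw)
        except ValueError as exc:
            rejected.append((raw, str(exc)))
            continue
        (premiums if rec.kind == "premium" else claims).append(rec)
    return premiums, claims, rejected
```

The point of the `rejected` list is that a row the mapping rules cannot handle is kept together with the reason, so somebody can follow up on it instead of the row quietly disappearing.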

[Diagram: overview of the system that Symfa developed for the client, as of 2022]

[Diagram: overview of the system as of 2024]

With higher data volume, more data issues come up, as identical mapping rules do not apply equally well to all data. More issues mean more solutions have to be found, and someone has to make sure that no issue is lost. Otherwise it’s the domino effect – one unresolved problem creates a handful of others and brings dirt into the data that we’re trying to get clean and reliable.

How I keep an eye on the project to provide clean data for the client

As a BA Lead, I’m in charge of the project consistency. This is achieved through a very (I mean it) simple tool called a tracker. Basically, a tracker is an Excel file that contains multi-level information about the tasks, issues, stakeholders, resolutions, bottlenecks and statuses.
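For readers who think in code rather than spreadsheets, one tracker row could be pictured roughly like this. This is a hedged sketch, not the layout of our actual file: the `TrackerRow` fields follow the list above, while the status values and the helper names are assumptions made for the example. It writes CSV, which Excel opens directly.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class TrackerRow:
    task: str
    issue: str
    stakeholder: str
    status: str = "open"  # hypothetical values: open / in progress / resolved
    bottleneck: str = ""
    resolution: str = ""


def to_csv(rows) -> str:
    """Serialize tracker rows as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(TrackerRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


def unresolved(rows):
    """Rows that still need attention, i.e. anything not marked resolved."""
    return [row for row in rows if row.status != "resolved"]
```

A filter like `unresolved` is the programmatic equivalent of scanning the sheet for everything that is not yet closed.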

I introduced the first tracker to give the client the stats on our progress – the volume of data processed over the given period. The idea grew on the client, and more trackers followed.

Introducing the right tracker at the right time is immensely empowering.


How do we make sure no issue is lost?

Through trackers.

In big data projects, no two data sets are identical, and exceptions pop up here and there. Before we joined, a lot of regular issues and exceptions were only captured in emails or meeting notes, so they never made it to the developers’ board and got lost easily. As a result, a lot of questions remained unanswered. We decided to put an end to this malpractice and to make all the development steps well-documented and crystal clear for the client.

How do we stay on the same page as to who’s responsible for what?

Trackers again.

With 23 individuals on Symfa’s side and dozens of stakeholders, system users and analysts on the client’s side, it’s hard to follow who does what. Both the business units and Symfa’s IT team make changes in one common online document, which allows us to quickly update and control the status of the issues.

[Image: the trackers we open first at meetings with the client]

Why not use Jira instead?

First, because we use Azure DevOps for progress monitoring with this client (reporting, resource allocation, etc.).

Second, it wasn’t about choosing the most technically advanced tool, but about meeting business needs. The same thing can be done in Jira or Azure DevOps hands down, but the option we chose allowed us to do it faster and stay on the same page with the business units. An online doc where each issue status is color-coded is a fast, easy and consistent way of keeping business stakeholders in the loop.

We made ourselves an aggregator where business requests from different sources are collected, so that we see the big picture without losing the details. The tracker is a safe space where requests are aggregated, analyzed and understood, and where the blockers are highlighted. We polish the requests before they get to the board. By the time a request gets to the board, it is a detailed task for the developers. Raw business requests, in contrast, come in various shapes and forms and often bring more questions than answers.
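The aggregation logic itself is simple enough to sketch in a few lines of Python. Again, this is an illustration under assumptions, not our real process in code: the request keys (`topic`, `source`, `open_questions`), the two statuses and the colors are hypothetical.

```python
from collections import defaultdict

# Hypothetical color code for the shared document.
STATUS_COLORS = {"blocked": "red", "ready": "green"}


def aggregate(requests):
    """Group raw requests by topic and derive one status per topic.

    Each request is a dict with the hypothetical keys 'topic', 'source'
    and, optionally, 'open_questions' (a list of strings).
    """
    grouped = defaultdict(lambda: {"sources": set(), "open_questions": []})
    for req in requests:
        entry = grouped[req["topic"]]
        entry["sources"].add(req["source"])
        entry["open_questions"].extend(req.get("open_questions", []))
    for entry in grouped.values():
        # A topic with unanswered questions is a blocker, not a dev task yet.
        entry["status"] = "blocked" if entry["open_questions"] else "ready"
        entry["color"] = STATUS_COLORS[entry["status"]]
    return dict(grouped)


def ready_for_board(aggregated):
    """Topics polished enough to become tasks on the developers' board."""
    return sorted(t for t, e in aggregated.items() if e["status"] == "ready")
```

Grouping by topic is what turns three emails and a meeting remark about the same thing into one line in the tracker, with every open question attached to it.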

Why is a BA a must-have for Big Data projects?

A BA is the link between the development team and the customer. This connection is maintained through documentation. This is the documentation set we’ve got ready for this project:

  • Project Overview
  • SAD
  • Test documentation
  • Data migration document (historical data over several years) – WIP. This one will tell everyone who joins the project after us how the data gets from stage 1 to stage N and what happens to it at each stage.

Through these documents, the client sees that we code exactly what was discussed. At the same time, they see the limitations of the software we use for coding, and of the project in general.

Documentation simplifies onboarding a great deal.

Furthermore, since the project is dynamic, a BA promptly flags changes in the data processing pipeline to the team. For this, we also keep a separate document where we record all the changes and modifications.

What’s the value that BAs generate for the client on this Big Data project?

Stabilized data for OLAP Cubes

The client used to do the analytics manually in Excel files, and the process hadn’t changed much since the ’90s. Since the data volume has grown tremendously, our task was to stabilize the data and ensure transparency, consistency and scalability.

To add data from multiple vendors into cubes and build reports, the data must be unified and stabilized. A BSA oversees the project, divides the areas of responsibility and conveys the customer's wishes to the team as accurately as possible, to ensure the expected value from our involvement.

We started working with one business line, where there were only 2 clients. Now we process six business lines, all of which are separate businesses, with their own specifics, products and solutions.

Project trackers 

This is very meticulous work I’m talking about: walking through the board and tons of daily emails, and bringing everything into one place to highlight the issues. Not to mention getting in touch with the stakeholders for the answers! Through trackers we make sure that there are no distorted performance results, no guesswork, and no issues left hanging in the air that could affect the future scope.

Documentation

The huge scope of the project allowed us to grow and to expand the BA team. Three more BAs joined and took over a big chunk of the work, thus creating a window for me to prepare quality documentation.

I immediately started with the project overview and showed it to the client:

–  Is this what you need?

–  Yeah, that's what we need.

After the first development cycle, I wrote the SAD:

–  Here’s what we’ve got. Do you see the constraints? It works like this and looks like this, if you break it down into layers. Are we on the same page?

– Yes.

If it weren’t for my team, though, I would not even have had the opportunity to write the documentation, because we would have been bogged down in routine tasks.

Last but not least

A professional BA team with good expertise always immerses itself in the project and gives the expected (or extra) value to the client. The best result is achieved through the proper distribution of responsibilities – preparation of scope, distribution of tasks, timely identification and elimination of blockers. We onboard, train, do the docs, translate the vision of the client into codable tasks and ensure seamless project knowledge transfer for the client.

If someone on the client’s team needs help with the data that we generate, the client sends them to our BSA team for the answers. We know everything about it – how the project grew, why certain parts were added and how they work. Running a Big Data project without such a team is like navigating a dense forest without a map.
