Project Overview

Two separate projects were discussed:

  1. Internal Data / Dashboard Project

  2. Marketing / CDMO Search Engine

The internal data project is the first priority and needs to be presented to the CEO/management.

1. Internal Data / Dashboard Project

Current Data Situation

  • Internal data comes from marketing and logistics departments.
  • Data is stored in:
      • SharePoint
      • OneDrive
      • Other internal locations
  • There is no centralized data lake currently.
  • Data exists mainly as different reports.

Document formats:

  • Excel
  • PDF
  • Word
  • Images (rarely)

These reports function essentially as a CRM stored in Excel files.

Business Segments

Reports are created across three business segments:

  1. Generics
  2. CDMO
  3. Nutraceuticals

Number of reports:

  • ~10 reports for Generics
  • ~5–6 reports for CDMO
  • ~15 reports for Nutraceuticals

Management currently reviews these reports manually.

Objective

Create dashboards from the existing reports to display key management information.

The dashboards will summarize and analyze the data from the reports.

Example Data Structure

Example scenario described in the meeting:

  • The company sells 25 products across 10 regions
  • Data includes:
      • Quantity sold
      • Price
      • Customer
      • Region
      • Dispatch information

Example question management might ask:

  • Which regions did Customer B buy Product A in?

The system should allow analysis across these combinations.
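As a rough illustration of this kind of analysis, the sketch below answers the example question over a handful of made-up rows. The record layout (SaleRecord), the function names, and every value in SALES are assumptions for illustration only; the real field names will come from the actual reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaleRecord:
    # Hypothetical layout based on the fields listed in the notes.
    product: str
    customer: str
    region: str
    quantity: int
    unit_price: float


# Synthetic rows, invented for this example.
SALES = [
    SaleRecord("Product A", "Customer B", "Region 1", 100, 12.5),
    SaleRecord("Product A", "Customer B", "Region 3", 40, 13.0),
    SaleRecord("Product A", "Customer C", "Region 2", 75, 12.0),
    SaleRecord("Product B", "Customer B", "Region 1", 20, 30.0),
]


def regions_for(sales, customer, product):
    """Return the sorted regions in which `customer` bought `product`."""
    return sorted(
        {s.region for s in sales if s.customer == customer and s.product == product}
    )


def revenue_by_region(sales, customer, product):
    """Sum quantity * unit_price per region for one customer/product pair."""
    totals = {}
    for s in sales:
        if s.customer == customer and s.product == product:
            totals[s.region] = totals.get(s.region, 0.0) + s.quantity * s.unit_price
    return totals


print(regions_for(SALES, "Customer B", "Product A"))
# ['Region 1', 'Region 3']
```

The same filter-then-group pattern extends to any combination of product, customer, and region.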

Key Sales Report

One major report tracks:

Weekly / Monthly Forecast

Includes:

  • Products expected to be dispatched
  • Sales outlook
  • Completed sales
  • Previous year sales
  • Next year projections

Also tracks opportunities, such as:

  • Potential sales in the next 18 months
  • Opportunities that will not convert into sales

Dashboard Requirements

Dashboards should exist at different management levels:

CEO / Management Dashboard

High-level overview.

VP Dashboard

More operational insights.

Manager Dashboard

More detailed operational data.

All dashboards draw on the same data, with access restricted according to the management hierarchy.
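One possible way to serve three levels from a single dataset is sketched below. The scoping rules are assumptions, not something decided in the meeting: here the CEO sees totals per segment, a VP sees totals per region within one segment, and a manager sees totals per product within one segment and region. All names (Row, User, dashboard) and figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    segment: str
    region: str
    product: str
    revenue: float


@dataclass(frozen=True)
class User:
    name: str
    role: str                      # "ceo", "vp", or "manager" (assumed roles)
    segment: Optional[str] = None  # scope for VPs and managers
    region: Optional[str] = None   # scope for managers


def visible_rows(user, rows):
    """Filter the shared dataset down to what this user may see."""
    if user.role == "ceo":
        return list(rows)
    if user.role == "vp":
        return [r for r in rows if r.segment == user.segment]
    if user.role == "manager":
        return [
            r for r in rows
            if r.segment == user.segment and r.region == user.region
        ]
    return []  # unknown roles see nothing


def dashboard(user, rows):
    """Aggregate visible rows at a level of detail that depends on the role."""
    group_key = {
        "ceo": lambda r: r.segment,
        "vp": lambda r: r.region,
        "manager": lambda r: r.product,
    }.get(user.role)
    totals = {}
    if group_key is None:
        return totals
    for r in visible_rows(user, rows):
        totals[group_key(r)] = totals.get(group_key(r), 0.0) + r.revenue
    return totals


# Synthetic rows, invented for this example.
ROWS = [
    Row("Generics", "US", "G1", 100.0),
    Row("Generics", "EU", "G2", 50.0),
    Row("CDMO", "US", "C1", 200.0),
    Row("Generics", "US", "G2", 25.0),
]

print(dashboard(User("ceo", "ceo"), ROWS))
# {'Generics': 175.0, 'CDMO': 200.0}
```

Filtering before aggregating means a restricted user never receives rows outside their scope, even in summarized form.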

Data Granularity Documentation

A separate document exists that describes:

  • Each report
  • What data is tracked
  • Granularity level
  • Purpose of the report
  • Who reviews the report
  • Typical management questions

This document will be shared after review.

Security Requirements

Data is highly confidential.

Requirements:

  • Must remain on-premise
  • Cannot leave company systems
  • No external hosting.

Possible access methods:

  • VPN access
  • A data scientist working on-premise

Development Approach

Step 1:
Build a mock-up tool using synthetic data.

Step 2:
Demonstrate how the system works.

Step 3:
Once approved, connect to the actual internal data.
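For Step 1, synthetic rows could be produced along these lines. This is a minimal sketch: the field names mirror the example data structure from the meeting, while the value ranges, the number of customers, and the function name make_synthetic_sales are assumptions. A fixed seed keeps the mock-up reproducible between demos.

```python
import random


def make_synthetic_sales(n_rows, n_products=25, n_regions=10, n_customers=8, seed=0):
    """Generate fake sales rows; no real company data is involved."""
    rng = random.Random(seed)  # local generator so the global state is untouched
    rows = []
    for _ in range(n_rows):
        rows.append({
            "product": f"Product {rng.randint(1, n_products)}",
            "region": f"Region {rng.randint(1, n_regions)}",
            "customer": f"Customer {rng.randint(1, n_customers)}",
            "quantity": rng.randint(1, 500),               # assumed range
            "price": round(rng.uniform(5.0, 100.0), 2),    # assumed range
        })
    return rows


sample = make_synthetic_sales(1000)
print(len(sample), sorted(sample[0].keys()))
```

Because the generator only depends on its arguments, the mock-up can later be pointed at the real data by swapping this function for a loader that returns rows with the same keys.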

2. Marketing / CDMO Search Engine

Objective

Create an internal search engine that can search across internal documents.

Purpose:

  • Quickly retrieve information from internal documents.
  • Avoid manually searching through files.

Data Sources

Documents are stored in SharePoint, in the following formats:

  • Excel
  • Word
  • PDF

Data is currently unstructured.

Search Engine Functionality

Users should be able to ask questions in natural language, as they would with ChatGPT.

Examples:

  • Financial information about a company.
  • Product pipelines from stored reports.
  • Information from internal company research.

The system should only use internal documents.

It should not interact with the web.

Example Use Case (Small Molecule Business)

The company supplies custom small molecules to customers.

Example queries:

  • Compare sales of a molecule supplied by Divi’s to a customer.
  • Check if another manufacturer supplies the same molecule to GSK.

All this data already exists in the internal database.

The system should retrieve it quickly.

Customer Relationship Data

Internal records track:

  • Commercial customers (approx. 15–20)
  • Target companies (wish list of ~150)
  • Responsible business development contacts
  • Molecules associated with each company
  • FDA database information
  • Meeting notes
  • Trade show interactions
  • Unit visits

Example queries:

  • Have we met this customer before?
  • What discussions happened?

Currently this requires manually searching documents, which can take 1–2 weeks.

The search engine should answer this quickly.
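If the interaction records were captured in a structured form, the two example queries reduce to a simple lookup. The sketch below assumes a hypothetical Interaction record with a date, a kind, free-text notes, and the file it came from; all entries in LOG are invented placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Interaction:
    customer: str
    when: date
    kind: str    # e.g. "meeting", "trade show", "unit visit"
    notes: str
    source: str  # file the record was taken from (hypothetical names below)


# Synthetic log, invented for this example.
LOG = [
    Interaction("Customer X", date(2023, 5, 10), "trade show",
                "Initial introduction.", "tradeshow_notes.docx"),
    Interaction("Customer Y", date(2023, 6, 2), "meeting",
                "Discussed sampling.", "meeting_notes.docx"),
    Interaction("Customer X", date(2024, 1, 15), "unit visit",
                "Toured the facility.", "visit_log.xlsx"),
]


def history(log, customer):
    """Return all interactions with `customer`, oldest first (case-insensitive)."""
    wanted = customer.strip().lower()
    return sorted(
        (i for i in log if i.customer.lower() == wanted),
        key=lambda i: i.when,
    )


def have_we_met(log, customer):
    """True if at least one interaction with `customer` is on record."""
    return bool(history(log, customer))


for item in history(LOG, "Customer X"):
    print(item.when.isoformat(), item.kind, "-", item.notes, f"[{item.source}]")
```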

Document Traceability

The system should:

  • Show which document each answer came from
  • Allow users to open the source file
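To illustrate traceability, the sketch below uses a deliberately simple keyword index in which every hit carries the path of the document it came from. This is a stand-in for whatever retrieval method is eventually chosen, not a proposal for it; the class name DocumentIndex, the file paths, and the document texts are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class DocumentIndex:
    """Tiny inverted index that keeps the source path of every document."""

    def __init__(self):
        self._postings = defaultdict(set)  # token -> set of source paths
        self._texts = {}                   # source path -> full text

    def add(self, source, text):
        self._texts[source] = text
        for token in tokenize(text):
            self._postings[token].add(source)

    def search(self, query):
        """Rank documents by how many distinct query terms they contain."""
        scores = defaultdict(int)
        for token in set(tokenize(query)):
            for source in self._postings.get(token, ()):
                scores[source] += 1
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [
            {"source": source, "matched_terms": n, "excerpt": self._texts[source][:80]}
            for source, n in ranked
        ]


# Synthetic documents, invented for this example.
index = DocumentIndex()
index.add("meetings/customer_x_visit.docx",
          "Met Customer X at unit visit; discussed molecule M1 supply.")
index.add("reports/pipeline.xlsx",
          "Pipeline report: molecule M2 for Customer Y.")

for hit in index.search("Customer X meeting"):
    print(hit["source"], hit["matched_terms"])
```

Returning the source path with every hit is what lets the interface link back to the original file, regardless of how the ranking itself is done.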

Workflow Notes (Generics Sales)

Sales process depends on region.

Example:

  • US / Europe → structured regulatory process.
  • APAC / MENA → faster conversion to sales.

Sampling is tracked through an internal tracker.

Additional Notes

  • Pricing is usually discussed later in negotiations.
  • Deals typically begin with one molecule, not multiple molecules at once.