MDM Solution for Healthcare & Pharma Industry
Simple Problem Statement
- Managing the following Master Data domains:
- PARTY Domain
- Patients
- Health care providers (HCP)
- Doctors (which is a subset of HCP)
- LOCATION Domain
- Health care organizations (HCO)
- Hospitals (which is a subset of HCO)
- PRODUCT Domain
- Drugs (at the NDC level)
- Medical Devices
- Managing the following Reference Data domains:
- NUCC taxonomy
- NPI number
- Managing the following Metadata domains:
- Business metadata
- Technical metadata
- Managing compliance
My Setup
Hardware
Server: MacBook Pro
Software
MDM Product: Orchestra Networks' EBX5
Database Server: PostgreSQL
Detailed Capabilities & Use Cases
Consolidation Use Case
Match Engine Vocabulary
Data Quality
Check the data quality of the source data by performing data profiling activities. This can be done either within the MDM system or outside of it.
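As a minimal illustration of what such profiling can look like outside the MDM system, the sketch below runs a few JDBC queries against a hypothetical patient_source staging table in PostgreSQL. The table name, column names, and connection details are assumptions for illustration only, not part of the actual solution; it requires the PostgreSQL JDBC driver on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PatientSourceProfiler {
    public static void main(String[] args) throws Exception {
        // Connection details are placeholders; adjust to your environment.
        try (Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mdm_staging", "mdm", "secret");
             Statement st = con.createStatement()) {

            // Basic completeness and uniqueness metrics for a hypothetical staging table.
            String sql =
                "SELECT count(*)                             AS total_rows, " +
                "       count(*) - count(ssn)                AS missing_ssn, " +
                "       count(*) - count(date_of_birth)      AS missing_dob, " +
                "       count(*) - count(DISTINCT source_id) AS duplicate_source_ids " +
                "FROM patient_source";

            try (ResultSet rs = st.executeQuery(sql)) {
                if (rs.next()) {
                    System.out.printf("rows=%d, missing SSN=%d, missing DOB=%d, duplicate source ids=%d%n",
                            rs.getLong("total_rows"), rs.getLong("missing_ssn"),
                            rs.getLong("missing_dob"), rs.getLong("duplicate_source_ids"));
                }
            }
        }
    }
}
```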
Data Cleansing
- Junk character removal
- Data type transformation
- Concatenation
- Split
- Standardization based on a cross-reference dictionary
- Data enrichment using third-party services (a small sketch of a few of these cleansing rules follows this list)
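As a rough sketch of the idea (not the EBX5 implementation), the helper below shows what junk character removal, concatenation, splitting, and dictionary-based standardization could look like in plain Java; the dictionary entries and example values are made up.

```java
import java.util.Map;

public class CleansingRules {

    // Cross-reference dictionary used for standardization (illustrative values only).
    private static final Map<String, String> SPECIALTY_XREF = Map.of(
            "CARD", "Cardiology",
            "ONC",  "Oncology",
            "PED",  "Pediatrics");

    // Junk character removal: keep letters, digits, spaces, hyphens, and apostrophes.
    static String removeJunk(String value) {
        return value == null ? null : value.replaceAll("[^\\p{L}\\p{Nd} '\\-]", "").trim();
    }

    // Concatenation: build a single display name from split name parts.
    static String concatName(String first, String last) {
        return (removeJunk(first) + " " + removeJunk(last)).trim();
    }

    // Split: break "LAST, FIRST" into its two parts.
    static String[] splitName(String fullName) {
        return fullName.split("\\s*,\\s*", 2);
    }

    // Standardization based on the cross-reference dictionary above.
    static String standardizeSpecialty(String code) {
        return SPECIALTY_XREF.getOrDefault(code == null ? "" : code.trim().toUpperCase(), code);
    }

    public static void main(String[] args) {
        System.out.println(removeJunk("O'Brien##"));             // O'Brien
        System.out.println(concatName("  John*", "Smith!!"));    // John Smith
        System.out.println(String.join(" | ", splitName("Smith, John")));
        System.out.println(standardizeSpecialty("card"));        // Cardiology
    }
}
```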
Data Life Cycle
- Audit Trail
- Data Lineage
- Horizontal Lineage
- Vertical Lineage
Stewardship
For every cluster that the system is unable to resolve automatically, it spawns work items that seek the attention of subject matter experts.
As the dashboard above shows, there are 4 such records that need attention and 4 workflows that have been started to resolve them.
The user can “take and start” any item that is offered to the group of users he/she belongs to.
When a matching steward starts working on one of those work items, the system takes the user to a specialized UI dedicated to that one specific cluster, where the user has these options:
- The user can determine that it is not a duplicate, in which case:
  - The user can take the suspected record out of the cluster and prepare it to participate in the matching process again.
  - The user can decide that the record is a unique golden record.
  - The user can decide that the record is not needed and soft delete it.
  - The user can modify this record and add missing values, then merge it into the Pivot.
  - The user can take this record out and manually add it to a different cluster that he/she is aware of.
- Or the user can determine that it is indeed a duplicate, in which case:
  - The user can auto-merge it using the survivorship logic (a minimal sketch of such logic follows this list), or
  - The user can manually merge it using the merge UI.
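The survivorship logic itself is configurable in the tool; purely as an illustration of the idea, the sketch below builds golden values by preferring the most recently updated non-empty value for each attribute. The record shape and the "most recent wins" rule are assumptions for this example.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

public class SurvivorshipSketch {

    // Simplified source record: one row of a matching cluster (illustrative fields only).
    record SourceRecord(String name, String phone, String address, LocalDate lastUpdated) {}

    // "Most recent non-empty value wins" survivorship rule for a single attribute.
    static String survive(List<SourceRecord> cluster, Function<SourceRecord, String> attribute) {
        return cluster.stream()
                .sorted(Comparator.comparing(SourceRecord::lastUpdated).reversed())
                .map(attribute)
                .filter(v -> v != null && !v.isBlank())
                .findFirst()
                .orElse(null);
    }

    public static void main(String[] args) {
        List<SourceRecord> cluster = List.of(
                new SourceRecord("John Smith", "212-555-0100", null, LocalDate.of(2018, 3, 1)),
                new SourceRecord("John  Smith", null, "10 Main St, New York", LocalDate.of(2019, 7, 15)));

        // Golden values assembled attribute by attribute.
        System.out.println(survive(cluster, SourceRecord::name));     // value from the most recent record
        System.out.println(survive(cluster, SourceRecord::phone));    // falls back to the older record
        System.out.println(survive(cluster, SourceRecord::address));  // 10 Main St, New York
    }
}
```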
Multi-Domain
Relationship between multiple domains – HCP & HCO example
Customer/Patient 360 view
Hierarchy Management
Data Governance
Example of Data Governance through a Maker-Checker workflow:
Patient Onboarding Workflow
- A business user starts the onboarding process. The system forces the user to search for a Patient record using the search-and-match logic.
- If a matching record is discovered, the business user can edit it to modify or enrich it.
- If no matching record is discovered, the search information is used to create a new record.
- Once the record creation process is complete, the record can be sent for approval.
- An approver can verify the item.
- If the approver agrees with the created item, he/she can approve it, or
- he/she can send it back to the creator for modification.
- Once an item is approved, the record becomes part of the golden master.
Patient Update Workflow
- A business user searches for a Patient record using the search-and-match logic.
- If a matching record is discovered, the business user can edit it to modify or enrich it.
- Once the record update process is complete, the record can be sent for approval.
- An approver can verify the item.
- If the approver agrees with the updated item, he/she can approve it, or
- he/she can send it back to the creator for modification.
- Once an item is approved, the record becomes part of the golden master. (A small sketch of this maker-checker state flow follows.)
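Purely to illustrate the maker-checker pattern behind both the onboarding and update workflows (this is not the EBX5 workflow engine), the sketch below models the record states and the allowed transitions; the state names are assumptions.

```java
import java.util.Map;
import java.util.Set;

public class MakerCheckerSketch {

    // States a patient record passes through in the onboarding/update workflows.
    enum State { DRAFT, PENDING_APPROVAL, APPROVED, RETURNED_FOR_MODIFICATION }

    // Allowed transitions: maker submits, checker approves or sends back, maker resubmits.
    private static final Map<State, Set<State>> TRANSITIONS = Map.of(
            State.DRAFT, Set.of(State.PENDING_APPROVAL),
            State.PENDING_APPROVAL, Set.of(State.APPROVED, State.RETURNED_FOR_MODIFICATION),
            State.RETURNED_FOR_MODIFICATION, Set.of(State.PENDING_APPROVAL),
            State.APPROVED, Set.of());

    static State move(State from, State to) {
        if (!TRANSITIONS.getOrDefault(from, Set.of()).contains(to)) {
            throw new IllegalStateException("Illegal transition " + from + " -> " + to);
        }
        return to;
    }

    public static void main(String[] args) {
        State s = State.DRAFT;
        s = move(s, State.PENDING_APPROVAL);           // maker sends the record for approval
        s = move(s, State.RETURNED_FOR_MODIFICATION);  // checker sends it back
        s = move(s, State.PENDING_APPROVAL);           // maker resubmits
        s = move(s, State.APPROVED);                   // record joins the golden master
        System.out.println("Final state: " + s);
    }
}
```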
Data Model Design
HCP/HCO
Patient
Architecture Design
- Modular Java architecture
- Optional web apps can be deployed independently in DEV, UAT, Prod (e.g. no data modeling in prod, no insight in UAT)
- Standard Java Web Application packaging for projects
Advanced Problem Statement
The use case is to address Master Data Management in the healthcare industry. In order to do so, I am trying to demonstrate a Patient Administration System (PAS).
PAS systems provide a number of core essential functions to hospitals:
- Master Patient Index
- Appointment Booking
- Waiting List Management
- Record of Patient Activity
- Activity Returns/Billing
- Reporting
- Admission
That will involve mastering the following entities (if I may call them so):
- Patients
- Health care providers (HCP)
- Doctors (which is a subset of HCP)
- Health care organizations (HCO)
- Hospitals (which is a subset of HCO)
- Products/Drugs
This also involves managing several reference data sets, such as:
- Ward
- Bed
- Etc.
Now, master data by itself won't reveal a lot about the patient. We can also demonstrate some predictive analytics features by combining the master data with big data and live streaming data.
One such use case is:
Getting Ahead of Patient Deterioration
In order to do that, we will need access to live streaming data for patients, combine it with the patient's medical history that we already store in MDM, and send a notification, as sketched below.
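As a very rough sketch of that idea (the thresholds, the event shape, and the MDM lookup are all assumptions for illustration), the code below flags a possible deterioration when a streamed vital sign breaches a threshold and the patient's MDM history shows a related risk factor.

```java
import java.util.Set;

public class DeteriorationAlertSketch {

    // A single reading from the live stream (illustrative shape).
    record VitalReading(String patientId, String vital, double value) {}

    // Risk factors pulled from the patient's master/medical history in MDM (illustrative).
    record PatientHistory(String patientId, Set<String> riskFactors) {}

    // Combine the live reading with the MDM history and decide whether to notify.
    static boolean shouldNotify(VitalReading reading, PatientHistory history) {
        boolean lowOxygen = "SPO2".equals(reading.vital()) && reading.value() < 92.0;
        boolean highHeartRate = "HEART_RATE".equals(reading.vital()) && reading.value() > 120.0;
        boolean atRisk = history.riskFactors().contains("COPD")
                || history.riskFactors().contains("HEART_FAILURE");
        return (lowOxygen || highHeartRate) && atRisk;
    }

    public static void main(String[] args) {
        VitalReading reading = new VitalReading("PAT-001", "SPO2", 90.5);
        PatientHistory history = new PatientHistory("PAT-001", Set.of("COPD", "DIABETES"));

        if (shouldNotify(reading, history)) {
            // In a real demo this would push a notification to the care team.
            System.out.println("Possible deterioration for patient " + reading.patientId());
        }
    }
}
```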
This MDM solution will not replace those clinical systems; rather, it will work in conjunction with them. The real use case is to demonstrate MDM in Pharma. Sorry, my explanation will be a bit lengthy and boring, and most of it you or others are already aware of, but for everyone else, this is what I was thinking.
Our main demo can be divided into two categories:
- Identity Data – such as patient, provider, and location identifiers
- Reference Data – which includes common linkable vocabularies such as ICD-9, DRG, SNOMED, LOINC, RxNorm, and order sets.
The solution should be able to stitch together data from other niche systems such as a Lab Information System (LIS), a Radiology Information System (RIS), and other custom applications. Moreover, with the focus on interoperability and Health Information Exchanges (HIE), it is imperative that we send the right patient identifiers across systems.
Also, more organizations are using analytics to help gain insights that drive care coordination and population health management. Analytics need a clean set of data to be useful, and hence it is extremely critical that master data be managed well.
From the EBX standpoint, our key use case is to demonstrate a Master Patient Index (MPI), the concept used to manage patient data. It includes assigning a unique identifier to each patient that can then be used by other systems and applications to refer to that patient. An organization has to decide on a matching approach as it works to consolidate the patient records from the various systems to create a clean master system of record. The most commonly accepted approach is algorithm-based, where an MDM system matches the patient's identifiable attributes such as name, date of birth, address, SSN, etc. to find duplicate or similar records. The algorithm can use either a probabilistic approach or a deterministic approach. In EBX we do both.
- A probabilistic matching algorithm assigns a likelihood score to the records to indicate whether they refer to the same entity, accepting a certain amount of volatility in the data. The higher the score, the greater the likelihood that the records match.
- The deterministic approach matches on a subset of the key attributes; if they are an exact match, the records are taken to refer to the same entity.
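To make the two approaches concrete (this is only an illustration, not the matching engine used in the solution), the sketch below does a deterministic exact match on SSN plus date of birth and a simple probabilistic score over name, date of birth, and address; the weights, threshold, and similarity rule are assumed values.

```java
import java.util.Objects;

public class MatchingSketch {

    // Minimal patient shape used for matching (illustrative fields only).
    record Patient(String name, String dob, String ssn, String address) {}

    // Deterministic: exact match on a subset of key attributes.
    static boolean deterministicMatch(Patient a, Patient b) {
        return Objects.equals(a.ssn(), b.ssn()) && Objects.equals(a.dob(), b.dob());
    }

    // Probabilistic: weighted likelihood score; the higher the score, the more likely a match.
    static double probabilisticScore(Patient a, Patient b) {
        double score = 0.0;
        if (similar(a.name(), b.name()))       score += 0.4;
        if (Objects.equals(a.dob(), b.dob()))  score += 0.3;
        if (similar(a.address(), b.address())) score += 0.3;
        return score;
    }

    // Crude similarity stand-in; a real engine would use phonetic/edit-distance comparators.
    static boolean similar(String x, String y) {
        if (x == null || y == null) return false;
        String nx = x.toLowerCase().replaceAll("\\s+", " ").trim();
        String ny = y.toLowerCase().replaceAll("\\s+", " ").trim();
        return nx.equals(ny) || nx.startsWith(ny) || ny.startsWith(nx);
    }

    public static void main(String[] args) {
        Patient a = new Patient("John Smith", "1980-04-12", "123-45-6789", "10 Main St");
        Patient b = new Patient("John  Smith", "1980-04-12", null, "PO Box 7");

        System.out.println("Deterministic match: " + deterministicMatch(a, b)); // false (SSN missing)
        System.out.println("Probabilistic score: " + probabilisticScore(a, b)); // 0.7 -> likely duplicate above a 0.6 threshold
    }
}
```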
The probabilistic matching approach allows for the greatest flexibility and provides the highest accuracy when properly configured. However, the quality of data across systems usually contributes to false positives and false negatives. I was hoping to use AMIT's AI/ML in this scenario to determine the false positives and address that. It's a long shot at the moment.
Based on the few MDM implementations I did in a past life (Pfizer, Merck, Boehringer Ingelheim), I can tell that one of the biggest issues with master data in Pharma is data quality, which includes duplication, fragmented data, lack of standardization, and incomplete information.
So, we can use our key strengths to demonstrate:
- Data Governance – encompasses the management and ownership of data within an organization. It includes the people, processes, and technology needed to make sure the data is secure, accessible, available, and used in an appropriate way. Data Stewards, who are essentially embedded in the business, understand the workflow, and are empowered to make decisions about the data, are the people who enforce standards and help make governance a reality in an organization. Data stewards are responsible for the data quality of the domain data.
- Data Integration using BW or TCI – the process of making sure all systems are using the Master Data from the system of record. Organizations can cascade their Master Data to other ancillary systems in either a transactional mode or a batch mode. A transactional approach is more real-time: whenever the Master Data is updated, the information is sent to the ancillary systems, so all systems are in sync at all times. An example of this would be: any time a new patient is created or updated in the EHR (if that is identified as the producer of the patient information), a real-time HL7 message is sent to the LIS, RIS, and other systems to make sure that the most up-to-date patient information is available at all times (a minimal sketch of such a push follows this list). In batch mode, data is extracted from the designated source system on a periodic basis and uploaded to the ancillary systems. In this approach there is a lag between systems, so it is not the preferred way, but it is still better than not having a process in place.
- Data Remediation – involves the process of addressing data quality as well as matching issues. Not all information can be matched using algorithms and sometimes manual intervention is needed to address issues.
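As a rough illustration of the transactional mode referenced above (a simplified, non-production sketch; a real implementation would typically use an integration engine or an HL7 library, and the host, port, and message fields below are placeholders), the code builds a bare-bones HL7 v2 ADT message for an updated patient and pushes it over an MLLP socket.

```java
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class Hl7PushSketch {

    // Build a deliberately minimal ADT^A08 (patient update) message; real messages carry many more segments and fields.
    static String buildAdtA08(String mrn, String lastName, String firstName, String dob) {
        String ts = LocalDateTime.now().format(DateTimeFormatter.ofPattern("yyyyMMddHHmmss"));
        return "MSH|^~\\&|MDM|HOSPITAL|LIS|LAB|" + ts + "||ADT^A08|" + ts + "|P|2.3\r"
             + "PID|1||" + mrn + "^^^MRN||" + lastName + "^" + firstName + "||" + dob + "\r";
    }

    // Send the message using MLLP framing: <VT> message <FS><CR>.
    static void send(String host, int port, String hl7) throws Exception {
        try (Socket socket = new Socket(host, port);
             OutputStream out = socket.getOutputStream()) {
            out.write(0x0B);
            out.write(hl7.getBytes(StandardCharsets.UTF_8));
            out.write(0x1C);
            out.write(0x0D);
            out.flush();
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder endpoint for the downstream LIS/RIS listener; will only connect if such a listener exists.
        send("lis.example.local", 2575, buildAdtA08("PAT-001", "SMITH", "JOHN", "19800412"));
    }
}
```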
Kinshuk Dutta
New York
Very informative. What add-ons have you used in the above problem statement, apart from DAQA?
The above solution used the following add-ons:
DAQA, DQID, DMDV, ADIX