How do we help people without technical background understand the mechanism of a machine learning model?
Ongoing HCI Research Project
Advisors: Nikola Banovic (PhD), Anind Dey (Faculty)
My Role: Research, UX/UI Design, Dashboard Design, Prototyping
Keywords: Data Visualization, Machine Learning, Human Behavior
I am currently working as a research assistant at the CMU Ubicomp Lab in collaboration with the team developing a machine learning model for predicting human routine behavior.
My role is to design a dashboard that presents the data more intuitively, so that end users can understand how our model makes predictions and draw insights from it. I am the sole designer and work closely with the engineers.
People react differently in different environments, and their individual behaviors in turn affect their surroundings. We describe environments and actions as sets of features. By encoding the features of particular situations and the behaviors they trigger, we can map out these intertwined influences, model human routines, and help people improve suboptimal habits.
The model can be applied to many fields; the one we focus on is healthcare.
Proposed Workflow
How will we use this ML model?
The existing model lacks an application that can present its enormous amount of data efficiently. As a result, people without training cannot easily make sense of it and instead get overwhelmed.
The core persona was developed from our research and is based on traits drawn from our research demographic.
My goal is to shine a light into this “black box”: to find an optimal way to help people visualize the otherwise hidden process of the model while enabling them to perform tasks. The goal can be broken down into three main parts.
- Demonstrate the probability of each possible state or specific action at different time segments.
- Visualize the most salient trend among the population.
- Predict which trend would lead to a certain outcome.
I approached the visualization ideation from both bottom-up and top-down directions, sketching both the representation of individual statistics and the overall bird's-eye view.
Top-Down: How can we organize and arrange the features efficiently so that they show the general trend in a self-explanatory way?
Bottom-Up: How can we present the enormous amount of per-feature statistics in an intuitive and clear way?
All data fed into the model is represented by the graph in the center of the dashboard. Situational features are shown in orange, action features in blue, and the final stage that the user defines as the outcome in green. Each situation is followed by the action it triggers, shown in sequential order along the timeline.
A Graphical Representation of Probabilities
Each row represents a single feature (e.g. pain level), and each column represents one of the values that feature can take (e.g. from no pain to severe pain).
I chose grid cells with varying transparency to show how likely each value is, because this makes trends easy to spot and stays readable: the more transparent a cell is, the less likely that value is to occur.
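As a rough illustration of the transparency encoding, the sketch below turns observed counts for one feature into per-cell opacities. The function name `opacity_row`, the direct use of probability as the opacity value, and the toy counts are assumptions made for this example; they are not taken from the actual dashboard code.

```python
def opacity_row(value_counts):
    """Map counts of each value of one feature to a cell opacity in [0, 1].

    Assumption for this sketch: opacity equals the empirical probability,
    so a rarer value gets a more transparent cell.
    """
    total = sum(value_counts.values())
    if total == 0:
        # No observations: every cell is fully transparent.
        return {value: 0.0 for value in value_counts}
    return {value: count / total for value, count in value_counts.items()}


# Hypothetical toy counts for the "pain level" row.
pain_counts = {"no pain": 6, "mild": 3, "severe": 1}
pain_row = opacity_row(pain_counts)

for value, alpha in pain_row.items():
    print(f"{value:>8}: opacity {alpha:.2f}")
```

In a real interface a small minimum opacity might be preferable so that rare values remain visible; that choice is left open here.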
How would Dr. Foster use the dashboard to achieve his goals?
1. Hover - check the statistics
Hovering over the timeline shows the date on which the data was collected. Hovering over a cell in the graph shows the feature and the value that the cell represents.
2. Filter - spot the trend of interest
If Dr. Foster wants to see the trend for patients who have not had surgery, he can click the cell that represents “Surgery = False”. The expanded trend is then composed of data from the non-surgery patients only, and he can continue to explore it in depth.
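Conceptually, the filter step restricts the data to matching records and recomputes the probabilities shown in the grid. The following is a minimal sketch of that idea, assuming patient records are plain dictionaries; the helper names, field names, and sample records are hypothetical and only serve the example.

```python
from collections import Counter


def filter_records(records, feature, value):
    """Keep only the records where `feature` has the given `value`."""
    return [r for r in records if r.get(feature) == value]


def value_distribution(records, feature):
    """Return the share of each value of `feature` among the records."""
    counts = Counter(r[feature] for r in records if feature in r)
    total = sum(counts.values())
    return {v: n / total for v, n in counts.items()} if total else {}


# Hypothetical toy records, not real patient data.
patients = [
    {"surgery": False, "pain_level": "severe"},
    {"surgery": False, "pain_level": "mild"},
    {"surgery": True, "pain_level": "no pain"},
    {"surgery": False, "pain_level": "severe"},
]

non_surgery = filter_records(patients, "surgery", False)
pain_dist = value_distribution(non_surgery, "pain_level")
print(pain_dist)
```

Applying the same two functions again to `non_surgery` with another feature would correspond to digging further into the filtered trend.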
3. Pin - fix the trend of interest
If he wants to divide the patients by whether they have been rehospitalized, he can click and drag the trend into the pinning area. The trend is then split into two separate trends.
4. Result Prediction
If Dr. Foster selects “Rehospitalization = True” as the end state, the model will provide the aggregated trend that leads to this result. The result is a prediction generated by the model, based on previous data. We don't define a fixed spot as the final result; instead, the user has the flexibility to choose which feature to show at the last stage.
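To make the idea of an aggregated trend for a chosen end state more concrete, here is a simplified sketch. It is an empirical stand-in, not the model's actual inference: it selects the recorded sequences that end in the chosen state and, for each time step, computes how often each feature value appears. The sequence layout (one dictionary per time step), the function name, and the sample sequences are all assumptions for illustration.

```python
from collections import Counter


def aggregate_trend(sequences, end_feature, end_value):
    """For sequences whose last step has end_feature == end_value, return,
    per time step, the share of each (feature, value) pair observed."""
    matching = [s for s in sequences if s and s[-1].get(end_feature) == end_value]
    if not matching:
        return []
    trend = []
    for t in range(max(len(s) for s in matching)):
        # Only sequences long enough to have a step t contribute to it.
        steps = [s[t] for s in matching if t < len(s)]
        counts = Counter()
        for step in steps:
            counts.update(step.items())  # counts (feature, value) pairs
        trend.append({pair: n / len(steps) for pair, n in counts.items()})
    return trend


# Hypothetical toy sequences: situation, then action, then outcome.
sequences = [
    [{"pain_level": "severe"}, {"medication": "skipped"}, {"rehospitalized": True}],
    [{"pain_level": "severe"}, {"medication": "taken"}, {"rehospitalized": True}],
    [{"pain_level": "mild"}, {"medication": "taken"}, {"rehospitalized": False}],
]

trend = aggregate_trend(sequences, "rehospitalized", True)
for t, shares in enumerate(trend):
    print(t, shares)
```

Because the end state is passed in as a parameter, any feature can serve as the final stage, which mirrors the flexibility described above.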
Design with the experts
My overall design process was mainly expert-driven: I stayed close to a few people who know the model well and shifted direction based on their feedback. Once the interface is developed, we will shift our focus to a user-centered approach to improve the end user's experience.
In fields where professional knowledge is the key barrier to entry, the qualitative difference between the “Wisdom of Crowds” and the “Wisdom of a Few” should be leveraged at different points along the design and innovation life cycle.
Because this is a pilot project in a new area, we faced many constraints. As development continues, we are determined to make progress in these areas:
- Migrate to using real data and refine the interface
- Create trends for individual participants
- Create the knowledge propagation feature (e.g. users can manually delete sub-trends that are known to be impossible)
- Add a horizontal collapsing feature (users can hide sections they are less interested in and move to different spots along the timeline easily)