The demonstration we propose aims to address the state of the art limitations. We propose a mechanism that mines business processes from email logs. We consider that human intervention is often required to generate reliable business process models but we aim to minimize it. We propose then an approach that enables users to collaboratively and gradually build an annotated corpus of messages and to automatically classify these ones into processes, instances and activity IDs using machine learning techniques. This is illustrated in Figures 1 and 2. In Figure 1 we can see the user manually annotating an email in order to enrich the learning dataset. This is achieved through the email client graphical user interface. In Figure 2 we see an annotated email, which is done automatically by the machine learning model. These two features are also illustrated in the video 1 (process name and activity name prediction).
Our proposal differs from existing works by: (1) Minimizing the effort required for building a training dataset through a collaborative approach (2) Building more discriminative classification and clustering features using references, named entities, email correspondents and email exchange history (3) Applying a fast and non-parametric clustering algorithm for process instance detection.
Once the list of processes are detected, the list of events (activity names) per process is detected, and finally the the associated process instance identifier for each event is detected we build an event log per process on which we can apply any process mining algorithm. Figure 3 shows the discovered business process model on a dataset which contains 180 emails related hiring business process.