3.2 Offline Scoring Pipelines
We just saw how to run offline scoring in a notebook. It works, but just like in the real-time inference case, notebooks are not a good choice for long-running processes and production-like scenarios. In the case of offline scoring, it’s best to define a pipeline as a series of tasks and submit that pipeline to a backend system that runs and orchestrates the pipeline tasks on the cluster. Let’s have a look at how this works with RHOAI and Data Science Pipelines.
-
In your workbench, double click on the
offline-scoring.pipeline
file. You can now see a graphical representation of the offline scoring pipeline within the Elyra Pipeline editor.
Elyra is an extension of JupyterLab, which offers a graphical editor for pipelines that can be run on Data Science Pipelines. |
-
If you compare this pipeline with the previous offline scoring notebook, you will find the same steps. In fact, the pipeline includes the very same code since the pipeline tasks represent the underlying Python modules of the notebook steps. While the notebook defines a purely linear sequence of steps, the pipeline allows to parallelize tasks such as ingesting the raw data and downloading the trained model, both of which are independent of each other.
-
Submit the pipeline by selecting
Run Pipeline
(second button in top toolbar). ClickOK
. -
Revisit the RHOAI Dashboard and go to the
Data Science Pipelines
section on the left side. Here you can track all submitted pipelines including their versions.
-
To monitor your pipeline, select
Runs
in the left menu and look at theTriggered
tab. Each run represents an execution of a given pipeline. Find the run that you submitted and click on its name. -
You can now see a graphical representation of your pipeline which updates in real-time as the tasks are completed. You can select each task and monitor its progress. Once the final task is finished, you can verify its completion by finding the timestamped results table in your S3 bucket.
Congratulations! You have run your first offline scoring pipeline in RHOAI, which concludes this section of the lab.