Testing GPU model on Cloud
Feb 20, 2022
Recently, a test to run a lifelib model on a GPU was conducted, as detailed in this post. The GPU used for that test was an Nvidia GeForce GTX 1650 in a consumer desktop PC, a card intended primarily for gaming. The next step was to conduct the same test using a GPU on a cloud platform, which is much more powerful than the GeForce card.
Finally, the test was performed using a high-performance GPU for accelerated computing available on Google Cloud Platform (GCP). This post describes how the test went.
Profiling the GPU model
Before running the test model on GCP, the model was profiled. The table below shows the three most time-consuming formulas.
Formulas | No. of calls | Duration (secs.) |
---|---|---|
max_proj_len() | 1 | 3.857875109 |
mort_rate(t) | 277 | 2.302189112 |
model_point() | 1 | 0.788285255 |
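For reference, a per-formula summary like the one above can be collected with modelx's stack-trace functions (start_stacktrace / get_stacktrace). The sketch below is only an assumption about how such a profile could be gathered; the model folder name and the entry-point formula are placeholders, not the actual test model's identifiers.

```python
import pandas as pd
import modelx as mx

model = mx.read_model("CashValue_ME_GPU")    # placeholder model folder

mx.start_stacktrace(maxlen=None)             # record every formula call
model.Projection.result_cf()                 # placeholder entry-point formula
summary = mx.get_stacktrace(summarize=True)  # per-formula calls and durations
mx.stop_stacktrace()

# Show the most time-consuming formulas first.
df = pd.DataFrame.from_dict(summary, orient="index")
print(df.sort_values("duration", ascending=False).head(3))
```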
The profiling result showed that the max_proj_len formula was taking nearly 4 seconds of the total run time of 10 seconds, despite being called only once. The max_proj_len formula was defined as below:
lambda: int(max(proj_len()))
proj_len() returns a CuPy array, each element of which holds the number of projection steps for each model point. It seems that Python's built-in max function operates inefficiently on the CuPy array, presumably because it iterates over the array element by element on the host rather than performing the reduction on the GPU.
Based on this analysis, the max_proj_len formula was changed as below:
def max_proj_len():
    return int(proj_len().max())
After the change, max_proj_len took almost no time, and the total run time was reduced from 10 seconds to 6 seconds.
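As a minimal sketch of why the change helps (this is not code from the model): Python's built-in max pulls the elements of a CuPy array through the host one by one, synchronizing with the GPU on each access, while ndarray.max() runs a single reduction kernel on the device. The array below is only a stand-in for the value returned by proj_len().

```python
import cupy as cp

# Stand-in for proj_len(): one projection length per model point.
proj_len_arr = cp.random.randint(1, 1200, size=50_000)

# Slow: the built-in max() iterates on the host, element by element.
slow = int(max(proj_len_arr))

# Fast: one reduction kernel on the device, then a single transfer back.
fast = int(proj_len_arr.max())

assert slow == fast
```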
Cloud GPU shortage
Due to high demand for GPU-accelerated computing capacity and a severe shortage of GPU supply, GCP was the only provider among GCP, AWS and Azure that approved a quota for a GPU-enabled machine. But even on GCP, it was not always possible to launch a GPU-enabled machine in January due to resource exhaustion.
Running the model on Google Cloud Platform
Here are the specs of the GCP virtual machine (VM) instance used for the test.
- Machine type: n1-standard-8
- GPU: Nvidia Tesla V100
- GPU memory size: 16 GB
- OS: Ubuntu 20.04
A Jupyter notebook server was run on the VM, and the model was uploaded to and run on the VM through the notebook in a browser on the local PC. The actual notebook used for the test is here.
The model ran successfully with 250,000 model points, taking about 12 seconds, as detailed in this stack trace. The model failed to run with 300,000 model points due to GPU memory exhaustion.
Since the same model with 50,000 model points took 6 seconds on the local GPU, the cloud GPU processed model points only about 2.5 times faster than the local GPU (250,000 / 12 ≈ 20,800 model points per second versus 50,000 / 6 ≈ 8,300).
During the run on the cloud GPU, GPU utilization was monitored with gpustat, and it peaked at only about 20%.
Although the test result is not very impressive, it does not mean that actuarial models run slowly on GPUs. GPU cores have the potential to make actuarial models run drastically faster when they are fully utilized. However, to run effectively on a GPU, a model needs to be carefully optimized; in this test, one small tweak saved 40% of the total run time. The cost of transferring data to and from GPU memory also needs to be taken into consideration and minimized.
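As a minimal illustration of that last point (not code from the test), the pattern below copies the inputs to the GPU once, keeps the computation on the device, and transfers only the final result back to the host; the array shape is arbitrary.

```python
import numpy as np
import cupy as cp

# Hypothetical host-side input: one row of projection inputs per model point.
host_data = np.random.rand(50_000, 120)

dev_data = cp.asarray(host_data)    # host -> GPU copy: pay this cost once
dev_result = dev_data.sum(axis=1)   # computation stays on the device
result = cp.asnumpy(dev_result)     # GPU -> host copy only for the final result
```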