Confidential Computing for Machine Learning in the Oil and Gas Industry

Industry domain: Oil and Gas
Encryption method: Fully Homomorphic Encryption (FHE)
Data type: Table
Learning algorithm: Logistic regression
Challenge: Detection of sand production

Customer

The Customer is a group of more than 20 companies that diagnoses flow and integrity throughout the oil and gas well system, from the wellbore to the reservoir, helping its clients make better decisions and improve asset performance.

The Customer invests a significant portion of annual revenues into R&D, collaborating with universities and industry partners to advance diagnostics. Their in-house expertise spans program design, data acquisition, tool and sensor manufacturing, software development, and data interpretation, establishing the Customer as a unique and trusted leader in through-barrier diagnostics.

The Customer's specialties include:

Through-barrier diagnostics
Well & reservoir flow assessment
Well integrity assessment
Field-wide reservoir assessment [multi-layer, cross-well]
Measurement modeling
Oil and gas, and energy

Challenge

Oil and gas companies are highly protective of their data. Access is strictly controlled, not only between competing companies but also between subsidiaries and even between departments of the same company that develop different fields. Keeping this data confidential is the top priority. In some cases the situation is further complicated by laws that prohibit data from leaving the borders of certain countries.

On the other hand, companies are actively developing predictive diagnostics for well operations using artificial intelligence methods. A clear obstacle to the development of ML algorithms is the unwillingness or inability of data owners to share their data with ML developers due to potential threats such as leaks, theft, and illegal use.

During the extraction and storage of gas, gas condensate, and oil, operators around the world face the problem of sand production from wells.

Sand production leads to equipment failure, reduced well productivity, and increased operational costs.

This problem is acute in the oil and gas industry, both at production wells and underground oil and gas storage facilities.

The solution is a system of ML models that detect sand production at oil and gas production and storage facilities, trained on a large amount of data from various sources.

Thus, to comply with confidentiality requirements and train the models, relevant data owned by different entities was exchanged in encrypted form.

It is important to emphasize that the data was protected throughout the entire process, including the stages of machine learning and inference.

Fully Homomorphic Encryption by Guardora

Protection of all important data on the owner’s side

Transmission of protected data to the ML team

Storage of protected data

Training the ML model on protected data

ML model quality check on protected data

Return of the model result in protected form

Removal of protection (decryption) and interpretation of the results by the data owner

Inference
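The stages above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not Guardora's scheme: it uses a textbook Paillier cryptosystem, which is only additively homomorphic rather than fully homomorphic, with fixed demo primes that are not secure. The function names, feature values, and weights are all hypothetical. The data owner encrypts two features, the ML team computes a linear score on the ciphertexts without seeing the values, and the owner decrypts the result.

```python
import math
import secrets

SCALE = 1000  # fixed-point scale: Paillier works on integers


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy key generation with fixed demo primes (NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt a signed integer m with generator g = n + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + (m % n) * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, sk, c):
    """Decrypt and map the result back to a signed integer."""
    lam, mu = sk
    m = (pow(c, lam, n * n) - 1) // n * mu % n
    return m - n if m > n // 2 else m


def encrypted_score(n, enc_features, weights, bias):
    """Compute Enc(bias + sum(w_i * x_i)) from ciphertexts only."""
    n2 = n * n
    acc = (1 + (bias % n) * n) % n2  # bias folded in as a plaintext constant
    for c, w in zip(enc_features, weights):
        acc = acc * pow(c, w % n, n2) % n2  # ciphertext ** w == Enc(w * x)
    return acc


# Data owner side: protect the features (hypothetical values 1.25 and -0.40).
pub_n, secret_key = keygen()
features_fixed = [round(1.25 * SCALE), round(-0.40 * SCALE)]
enc_features = [encrypt(pub_n, x) for x in features_fixed]

# ML team side: only ciphertexts and its own weights (0.8, -1.5, bias 0.1).
weights_fixed = [round(0.8 * SCALE), round(-1.5 * SCALE)]
bias_fixed = round(0.1 * SCALE * SCALE)  # bias lives at scale SCALE**2
enc_result = encrypted_score(pub_n, enc_features, weights_fixed, bias_fixed)

# Data owner side: remove protection and interpret the result.
score_fixed = decrypt(pub_n, secret_key, enc_result)
score = score_fixed / SCALE**2
print(score)
```

The ML team never holds the secret key, so it sees neither the features nor the score. A fully homomorphic scheme additionally supports multiplying two ciphertexts, which is what training on encrypted data requires.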

Solution

Machine learning on encrypted data.

Guardora achieved full training of the ML model on encrypted data, followed by inference on encrypted samples.

In addition, once the model is decrypted, it can be used for inference on public data with the same characteristics. This is useful when the model is trained in the cloud or on third-party encrypted data and then used on the owner's own computational resources or unclassified data.

Accuracy = 0.73585
Training result for logistic regression with two parameters on open (unencrypted) data

FHE Accuracy = 0.72453
Training result for logistic regression with two parameters on fully homomorphically encrypted data
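One reason encrypted training can differ slightly from open training is that common FHE schemes natively evaluate only additions and multiplications, so the sigmoid is often replaced by a low-degree polynomial. The source does not say how Guardora handles this. The sketch below is a plaintext illustration of that general idea, assuming the degree-3 Taylor expansion of the sigmoid and a hypothetical six-sample dataset with two features. It contains no encryption.

```python
import math


def sigmoid(z):
    """Exact logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def poly_sigmoid(z):
    """Degree-3 Taylor expansion around 0: uses only + and *."""
    return 0.5 + z / 4.0 - z ** 3 / 48.0


def train(samples, labels, activation, steps=10, lr=0.5):
    """Batch gradient descent for a two-feature logistic regression."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(samples)
    for _ in range(steps):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for (x1, x2), y in zip(samples, labels):
            err = activation(w[0] * x1 + w[1] * x2 + b) - y
            grad_w[0] += err * x1
            grad_w[1] += err * x2
            grad_b += err
        w = [w[0] - lr * grad_w[0] / n, w[1] - lr * grad_w[1] / n]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, samples, labels):
    """Share of samples whose predicted class (score > 0) matches the label."""
    hits = 0
    for (x1, x2), y in zip(samples, labels):
        predicted = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
        hits += int(predicted == y)
    return hits / len(samples)


# Hypothetical, already-normalized toy data (not from the pilot project).
samples = [(1.0, 1.0), (0.5, 0.8), (0.9, 0.2),
           (-1.0, -0.5), (-0.7, -0.9), (-0.3, -0.6)]
labels = [1, 1, 1, 0, 0, 0]

w_exact, b_exact = train(samples, labels, sigmoid)
w_poly, b_poly = train(samples, labels, poly_sigmoid)

# Largest gap between the polynomial and the exact sigmoid on [-1, 1].
max_err = max(abs(poly_sigmoid(k / 100) - sigmoid(k / 100))
              for k in range(-100, 101))

print("exact:", accuracy(w_exact, b_exact, samples, labels))
print("poly: ", accuracy(w_poly, b_poly, samples, labels))
print("max approximation error on [-1, 1]:", max_err)
```

The Taylor polynomial is accurate only near zero, which is why the inputs are normalized and the number of steps is kept small. Approximations fitted over a wider interval are a common alternative.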

Inference of encrypted data.

Guardora also trained the model on public data and then adapted it for inference on encrypted samples.

This is useful when the training set contains no sensitive information but the data processed afterwards is sensitive.

Accuracy = 0.86792
Inference result of unencrypted samples on an XGBoost classifier trained on public data with two parameters

Accuracy = 0.84960
Inference result of FHE-encrypted samples on an adapted XGBoost classifier trained on public data with two parameters
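To put the reported figures side by side, the short calculation below computes the accuracy cost of encryption in both experiments, using only the four accuracy values given above. The function and variable names are illustrative.

```python
def accuracy_gap(plain, encrypted):
    """Return the absolute and relative accuracy loss due to encryption."""
    absolute = plain - encrypted
    return absolute, absolute / plain


# (accuracy on open data, accuracy with FHE), as reported above.
reported = {
    "Logistic regression, training on encrypted data": (0.73585, 0.72453),
    "XGBoost, inference on encrypted samples": (0.86792, 0.84960),
}

gaps = {name: accuracy_gap(plain, enc) for name, (plain, enc) in reported.items()}

for name, (absolute, relative) in gaps.items():
    print(f"{name}: -{absolute * 100:.2f} percentage points "
          f"({relative * 100:.1f}% relative)")
```

In both cases the loss is under two percentage points: about 1.1 points for the logistic regression and about 1.8 points for the XGBoost classifier.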

Results

Four companies participated in the pilot project: the Customer, two end users, and Guardora.

The sand production data for joint training came from 11 wells belonging to the two end users.

Geophysical data did not leave the trusted boundaries of the data owners; the exchange was carried out only with encrypted values.

A joint model was formed on the server based on the encrypted model information from both users.
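The source does not describe the aggregation protocol. As a stand-in, the sketch below uses pairwise additive masking, a simple secure-aggregation technique rather than homomorphic encryption, to show how a server can combine model weights from two users without seeing either user's individual values. The weight vectors, the modulus, and all names are hypothetical.

```python
import secrets

MOD = 2**64      # all arithmetic is done modulo this value
SCALE = 10**6    # fixed-point scale for the model weights


def to_fixed(weights):
    """Encode float weights as non-negative integers modulo MOD."""
    return [round(w * SCALE) % MOD for w in weights]


def from_fixed(values):
    """Decode integers modulo MOD back to signed floats."""
    return [((v - MOD) if v >= MOD // 2 else v) / SCALE for v in values]


# Hypothetical local models of the two end users (two weights plus a bias).
weights_a = [0.80, -1.50, 0.10]
weights_b = [0.60, -1.10, 0.30]

# The two users agree on a random mask that the server never sees.
mask = [secrets.randbelow(MOD) for _ in weights_a]

# User A adds the mask and user B subtracts it before sending.
sent_a = [(w + m) % MOD for w, m in zip(to_fixed(weights_a), mask)]
sent_b = [(w - m) % MOD for w, m in zip(to_fixed(weights_b), mask)]

# Server side: the masks cancel in the sum, so only the total is revealed.
summed = [(a + b) % MOD for a, b in zip(sent_a, sent_b)]
joint_weights = [v / 2 for v in from_fixed(summed)]

print(joint_weights)
```

Each value the server receives is uniformly random on its own, and only the sum carries information. With an additively or fully homomorphic scheme the same averaging would be done directly on ciphertexts.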

The accuracy of the joint sand detection model increased from 70% to 85%.

Significant adjustments were made to the well workover planning for 3, 6, and 12 months.

Unplanned workovers were minimized.

Well downtimes waiting for workovers were reduced.

Environmental impacts caused by hydrocarbon leaks were eliminated.

Direct savings and cost optimization from predictive analytics of well condition amounted to approximately US$200 million per year.

Work on predicting corrosion development in wells was planned.