Steam Optimization and Other Oddities

$25,000

Announcements

Data Update (10/5/21)

We have become aware of a data issue in the “Steam Optimization and Other Oddities” challenge. Thank you to Xeekers Tomy4reel and Nick L. for bringing these issues to our attention - we will be sending you a special prize as a thank you!

To address this issue, we will be doing the following:

  1. The leaderboard has been reset. 

  2. Variables "GAS_H" and "RMNG_OIL_H" have been removed from the private holdout dataset. Contestants should remove these variables when training their models.

  3. A new test dataset is now available. Please ensure you are using the new test dataset and not the old one when generating predictions. 

  4. A new starter notebook with minor edits is available. 

  5. We will be adding an additional $5,000 to the prize pool, for a total prize pool of $25,000. The total prizes will now be as follows:

     • First Prize: $12,000

     • Second Prize: $6,000

     • Third Prize: $3,500

     • Colorado School of Mines First Prize: $2,500

     • Colorado School of Mines Second Prize: $1,000
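
Item 2 above asks contestants to remove "GAS_H" and "RMNG_OIL_H" before training. The following is a minimal sketch of one way to do that, assuming the rows have been loaded as dictionaries (for example with csv.DictReader). The helper name drop_removed_variables is hypothetical and does not come from the starter notebook.

```python
# Variables removed from the private holdout dataset (see the data update).
REMOVED_VARIABLES = ("GAS_H", "RMNG_OIL_H")


def drop_removed_variables(rows, removed=REMOVED_VARIABLES):
    """Return new row dictionaries without the removed variables.

    The input rows are left unchanged. Variables that are already
    absent from a row are simply skipped.
    """
    return [
        {name: value for name, value in row.items() if name not in removed}
        for row in rows
    ]
```

With pandas, a similar result can be obtained with DataFrame.drop(columns=["GAS_H", "RMNG_OIL_H"], errors="ignore").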

Check-In Signup

Sign up for a check-in on October 14th to earn 5 bonus points toward your final score. The challenge sponsors will be on hand to get an early look at your work, provide feedback, and answer questions. Sign up here!

Team Registration

If you'd like to register as a team, please do the following:

  • All team members must individually sign up for Xeek and join the challenge.

  • One team member must email hello@xeek.ai with the names of all team members (team members should be CCed on the email).

  • Only one member per team is allowed to submit predictions. Teams must conform to all submission requirements (only 5 prediction submissions per day per team). If teams are found in violation of this (multiple members submitting predictions), all team members are subject to disqualification.

Final scoring of submissions

Final submissions must include a Jupyter notebook, a requirements file, and a short video explaining the submission. Final judging will look at performance of the code against blind test data, creativity of the solution, and reproducibility of the results. Submissions must be able to be run by the judges in a reasonable amount of time (<6 hours). We will use a SageMaker ml.c5.12xlarge instance. You can learn more about the specifications for this instance here: https://aws.amazon.com/sagemaker/pricing/instance-types/.

Challenge Description

Aera has been an active energy producer in California for the last 24 years. Aera is always seeking new ways to produce energy more responsibly. One such initiative is to reduce the amount of input energy necessary to operate their oil fields. These oil fields have high viscosity oil that requires steam injection for production. Aera engineers are working to find opportunities to maintain production levels while reducing the amount of steam necessary and thus the energy needed to operate.

This is where you come in! Aera is excited to have the Xeek community dig into geoscience data collected from decades of operations to create a model that describes steam movement and oil desaturation for one of their primary fields.  While predictions have been built in the past, Aera is interested in seeing a diversity of thought in solving this issue that only the Xeek Community can provide.  

Background

Aera’s method of hydrocarbon extraction is similar to methods developed elsewhere in the world for high viscosity oil. For a good overview of the physical processes that produce the data, please check out Zerkalov (2015). If you have access to journal articles from the Society of Petroleum Engineers, you can also review SPE 11219 and SPE 13348.

Figure 1: Diagram of how steam injection is used for enhanced oil recovery (Zerkalov, 2015)

At a high level, in the subsurface where Aera operates, there are hundreds of sands saturated with high viscosity oil. A complex of injector and production wells is drilled into different sand intervals. Some of these wells inject steam while others have pumps to extract the oil. As the steam moves through the sand from injector wells, it mobilizes the heavy oil and pushes it towards the production wells.

The data you are given for this challenge has been collected from producing wells over decades of operations. The data records the changes to steam injection and oil production over time for each sand in a well. A well can penetrate numerous sands. Also included in this data are properties of the sand (e.g., Dip) and the relative location of the producing well to injector wells.

Description of the Dataset

Contestants will be supplied a data table with all the necessary inputs to build a model.

Aera_column_descriptions.JPG

Table 1: Description of data columns

Some other considerations regarding this data:

  • A “sand” is an individual layer.  A “reservoir” is a family of sands that are adjacent and have similar properties.

  • Some sands may end before reaching an injector well, which is why there are several NaN values in some of the columns. A NaN value means the sand is not present.

  • The distance from the producing well to the injectors is an important factor, but instead of providing map locations, the dataset has relative distances from the producing well to its three nearest injector wells.
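
Because a NaN value means the sand is not present, one option is to record that absence as an explicit feature before any imputation. The sketch below is one possible approach, not the organizers' method; the function names and the column names in the usage example are hypothetical.

```python
import math


def is_missing(value):
    """Return True for None or a float NaN, False for anything else."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def add_presence_flags(row, columns):
    """Return a copy of row with a <column>_PRESENT flag per column.

    The flag is 1 when the value is present and 0 when it is NaN or
    missing, which by the dataset notes means the sand is not present.
    """
    flagged = dict(row)
    for column in columns:
        flagged[f"{column}_PRESENT"] = 0 if is_missing(row.get(column)) else 1
    return flagged
```

For example, add_presence_flags({"DIST_INJ_1": 120.0, "DIST_INJ_2": float("nan")}, ["DIST_INJ_1", "DIST_INJ_2"]) adds DIST_INJ_1_PRESENT = 1 and DIST_INJ_2_PRESENT = 0 while keeping the original values.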

Target Output

The target variable for this competition is PCT_DESAT_TO_ORIG. Models should predict this variable. Additionally, we welcome creative visualizations or other analyses as detailed in the Evaluation Criteria.

Evaluation Criteria

During the challenge, a quantitative score will be used to populate the leaderboard. Participants will be expected to upload CSVs as described in the Starter Notebook. Submitted CSVs will be compared to withheld test data. The scoring algorithm will derive a similarity score by computing a root mean squared error, accruing the total error across all points in the submission.  A lower score is considered more successful.  Contestants can submit up to 5 CSV predictions per day.
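
To check predictions locally before using one of the five daily submissions, the root mean squared error can be computed on a validation split. This is a minimal sketch of the standard RMSE formula; the exact leaderboard implementation is not published here, so the function name and input handling are assumptions.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one value is required")
    # Sum the squared differences, average them, then take the square root.
    squared_error = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_error / len(y_true))
```

A perfect prediction gives 0.0, and larger values indicate a worse fit, which matches the rule that a lower score is more successful.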

Figure 2: Map view of one producer well and three injector wells

At the end of the challenge, contestants will be asked to submit a zip file containing their code in a Jupyter notebook, a requirements file, and a 2-4 minute video explaining the advantages of their approach and how the code works. 

Submissions will be judged using a mix of quantitative and qualitative criteria. Qualitative criteria help ensure that the models can be easily deployed and display creativity. The final score will be calculated based on the following criteria, out of a total of 100 possible points:

Performance

(Up to 60 total points)

This criterion will use the same method as the leaderboard during the competition with a holdout portion of the data. The top 20% of scores will receive full points. Other submissions will receive points based on how close they were to the top performing submissions. The judging team must be able to independently verify your results for you to receive points. Please ensure that your submission includes detailed requirements and instructions on how to run the code, so that we are able to run your final model in our environment.

Creativity

(Up to 20 total points)

A panel of judges will be reviewing the notebook and submission video looking at how contestants explored the data, what features were analyzed, and the ingenuity of the model created for this challenge.  Submissions with exceptional creativity will receive full points. Other submissions will receive points based on the level to which they meet the criteria.

Interpretability

(Up to 20 total points)

This criterion focuses on the degree of documentation, clearly stating variables for models, and following standard Python style guidelines.  Submissions with exceptional interpretability will receive full points. Other submissions will receive points based on the level to which they meet the criteria.

At the end of judging, final scores will be placed on the Xeek challenge page. If a tie occurs then the judging panel will break the tie by evaluating the level of code documentation.

Citations

Zerkalov, G. (2015): Steam Injection for Enhanced Oil Recovery; http://large.stanford.edu/courses/2015/ph240/zerkalov2/