Our client, a UK-based startup aims to transform real estate by harnessing publicly available data for a cutting-edge Machine Learning model. This model will provide precise property price forecasts for the UK over the next 5 years, empowering investors and enthusiasts with data-driven decision-making capabilities. With a focus on refining predictions despite limited historical data, the startup seeks to gain crucial insights for informed decision-making, gaining a competitive edge in the market, and maximizing opportunities while mitigating risks.

Customer Challenges

Data Integration from Diverse Sources

The challenge of managing over 40 data sources proved formidable. While some offered convenient APIs, others required manual download and upload, complicating integration efforts. Nonetheless, NeenOpal's dedication to thorough data integration ensured seamless amalgamation, strengthening their predictive model.

Limited Historical Data for Monthly and Yearly Predictions

NeenOpal encountered a critical hurdle in generating precise monthly and yearly real estate price predictions due to limited historical data availability. Overcoming this constraint necessitated a meticulous approach to feature engineering and model architecture.

Managing Geospatial Heterogeneity in Data Granularity

The geospatial intricacies posed a unique challenge as not all data points were consistently at the LSOA level; some spanned across the entire UK, while others were at MSOA or different regional levels. NeenOpal had to grapple with the task of harmonizing diverse data granularities to ensure a cohesive and accurate predictive model.

Solutions

NeenOpal achieved optimal AWS resource utilization by integrating AWS Glue and Lambda for streamlined data preprocessing. The multi-model NHITS approach, coupled with a robust data integration strategy using Amazon Web Services, has enhanced predictive modeling accuracy across diverse data sources.

The data preprocessing pipeline, involving AWS Glue and Lambda, seamlessly orchestrated data from diverse sources into the RDS. Furthermore, the team optimized computational resources by utilizing both CPU-based and GPU-based EC2 instances.

The utilization of EC2 instances, specifically GPU-based, for training the final model in Python underscores NeenOpal’s commitment to harnessing high-performance computing to achieve accurate predictions in a resource-efficient manner. This challenge emphasized the need for strategic utilization of Amazon cloud resources to enhance computational efficiency and model training capabilities.

The final model, centered around NHITS (Neural Hierarchical Interpolation for Time Series), employed a nuanced approach. Instead of relying on a single model, NeenOpal implemented multiple instances of the NHITS model, each tailored for different input and output horizons.

To streamline this complexity of collecting data from over 40 sources, NeenOpal implemented a robust data integration strategy. AWS Glue and Lambda functions were instrumental in automating the extraction of data from API sources. Additionally, for manual uploads, a seamless process was established using Amazon S3 — files uploaded to S3 triggered Lambda functions, ensuring efficient and automated data transfer to the Amazon Relational Database Service (RDS). This intricate data orchestration process was essential to manage the diverse data landscape and lay the foundation for accurate predictive modeling.

AWS SERVICES USED

AWS Lambda
AWS Lambda
AWS Glue
AWS Glue
Amazon RDS
Amazon RDS
Amazon S3
Amazon S3
EC2
EC2
Secrets Manager
Secrets Manager
CloudWatch
CloudWatch
SNS
SNS

Why choose NeenOpal?

NeenOpal Inc. is an AWS Advanced Services Path, Differentiated Partner and we are in the Public Sector and Well Architected Partner programs. As a forward-thinking consultancy, we specialize in unlocking the transformative power of data to drive business growth. Our team comprises AWS Certified experts committed to staying abreast of the latest industry advancements.

Benefits

Conclusion

This case study details the practical implementation of a forward-thinking solution. Utilizing publicly available data, the developed Machine Learning model forecasts property prices for the next 5 years. The tangible benefits of this solution extend beyond precise predictions, including optimized marketing budgets and enhanced overall performance, providing stakeholders with invaluable insights for strategic decision-making.

Written by:

Barnali Shil

Senior Associate Consultant

LinkedIn

Madiha Khan

Content Writer

LinkedIn