a. Introduction:
- Customer Background:
- Propstack is a leading provider of commercial real estate (CRE) data, analytics, and technology solutions. The company offers a comprehensive B2B platform that enables real estate professionals, investors, and corporations to make data-driven decisions.
- Challenges:
- The current CRE/CRM system is not suitable for analytics, as it only stores transactional data and lacks historical data storage and retrieval. This limitation prevents retail real estate companies from analyzing historical trends, making data-driven conclusions, and forecasting or setting clear goals for the future.
b. Solution:
- Implementation:
- The project involves creating a DataLake on Azure Cloud that accumulates the daily data transmissions and consolidates them into a format suitable for historical analysis and reporting. It also serves as a foundation for machine learning (ML) forecasts for sales and purchases. Key steps include:
- Cleaning and transforming data
- Consolidating historical data
- Combining data from multiple sources
- Creating pipelines with dataflows to merge data from different mandates
- Creating Power BI reports and dashboards showing the evolution of metrics with appropriate KPIs
- Providing a chatbot that answers analytical questions about operational data in the data lake using Azure OpenAI and SQL data sources in AI Search Service
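The "consolidating historical data" step above can be sketched as follows. This is a minimal illustration in plain Python, not the Data Factory pipeline used in the project: the function name `append_snapshot`, the `id` key, and the `valid_from` column are assumptions made for the example. The idea is that each daily transmission is compared with the latest stored version of every record, and only new or changed records are appended with their load date.

```python
from datetime import date


def append_snapshot(history, snapshot, snapshot_date, key="id"):
    """Append rows from a daily snapshot that are new or changed.

    `history` is a list of dicts ordered by load date; `valid_from` is a
    hypothetical column holding the date a version was first seen.
    """
    # Latest stored version per key (later rows overwrite earlier ones).
    latest = {}
    for row in history:
        latest[row[key]] = row

    for row in snapshot:
        previous = latest.get(row[key])
        if previous is not None:
            # Compare business fields only, ignoring the load-date column.
            previous = {k: v for k, v in previous.items() if k != "valid_from"}
        if previous != row:
            history.append({**row, "valid_from": snapshot_date})
    return history


history = []
append_snapshot(history, [{"id": 1, "status": "available"}], date(2024, 1, 1))
append_snapshot(history, [{"id": 1, "status": "available"}], date(2024, 1, 2))
append_snapshot(history, [{"id": 1, "status": "reserved"}], date(2024, 1, 3))

for version in history:
    print(version["valid_from"], version["status"])
```

Storing only changed versions keeps the history compact while still allowing a report to reconstruct the state of a record on any given day.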
- Team Composition:
- One Software Architect expert in Microsoft services and Power BI (PBI)
- Two PBI Developers (recently certified in Microsoft Fabric)
- One Data Scientist
- One Software Developer
- Features Used:
- DataLake creation: SQL Server
- Data transformation and combination: Data Factory
- Historical data consolidation: Data Factory
- Pipeline creation with dataflows: Data Factory
- Power BI report and dashboard design with KPIs: Power BI
- Chatbot implementation: App Service, App Service plan, Azure OpenAI, Search Service, Storage Account
c. Results:
- Customers
- Centa Immobilien (a Propstack customer) already uses the DataLake solution.
- We create reports for 14 companies (list here all customers for whom we create reports, even for Flowfact)
- Benefits:
- Real estate companies can now access historical data and understand how changes influence results.
- The data, stored independently in the DataLake, can be migrated to any other solution.
- Power BI reports and dashboards are created faster than before due to the common data model in the DataLake.
- Ad-hoc reports can be made for specific purposes by any PBI developer, eliminating the dependency on requesting new functionalities in the original platform.
- All steps from data extraction to presentation in dashboards and reports are well-documented, using Microsoft services, making future modifications and extensions straightforward.
- The solution can be plugged into any CRE/CRM software; we only need to map the external data to our common data model in the Lakehouse.
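As an illustration of the last point, mapping external data to the common data model can be as simple as a per-source lookup table. The sketch below is hypothetical: the source column names (`objekt_id`, `kaufpreis`, `ort`), the target names, and the function `to_common_model` are invented for the example and are not taken from Propstack or from the actual data model.

```python
# Hypothetical mapping from one source system's column names
# to the column names of the common data model.
FIELD_MAP = {
    "objekt_id": "property_id",
    "kaufpreis": "purchase_price",
    "ort": "city",
}


def to_common_model(record, field_map):
    """Rename known fields and collect unknown ones for review."""
    mapped = {}
    unmapped = {}
    for source_name, value in record.items():
        target_name = field_map.get(source_name)
        if target_name is None:
            # Keep fields without a mapping so nothing is lost silently.
            unmapped[source_name] = value
        else:
            mapped[target_name] = value
    return mapped, unmapped


mapped, unmapped = to_common_model(
    {"objekt_id": 17, "kaufpreis": 320000, "ort": "Berlin", "custom_note": "x"},
    FIELD_MAP,
)
print(mapped)
print(unmapped)
```

Under this approach, onboarding another CRE/CRM system would mainly mean writing a new mapping table, while the reports keep reading the common model.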
d. Conclusion:
- Summary:
- The creation of the DataLake enables Propstack customers to take ownership of their data and use it for multiple purposes. The Power BI reports now include historical data, have a better layout, are fully customized, and perform better than the original CRE/CRM system.
- The services used, including Microsoft Fabric, Azure Services, Databricks, OpenAI, and Power BI, are well-known and can be extended for future needs.
- The new Lakehouse provides an independent single source of truth, enabling ML analysis and chatbot interaction.
- Future Plans:
- In the next phase, we aim to scale the product to Propstack’s 300 customers.
- Extend the usage of the chatbot (named Dainsy) for real estate data insights.
- Future steps include running models for supplier forecasts and classifying customers/suppliers as good or bad payers to make informed recommendations.
Propstack/Remax background information
Propstack (https://www.propstack.de/) is a specialized CRM system for the real estate industry.
The client (Remax) is an internationally oriented real estate franchise. The franchisees are independent real estate offices that are separate economic entities. A franchisee can have anything from one office in one location to several offices in different locations. Certain information should not be accessible across franchisees; other information, predominantly property (real estate) data, should be. Evaluation requests will come from the franchisor as well as from franchisees. Duplicate data is to be expected during the project.
- The Propstack CRM provides a separate database area for each customer. In a franchise group with a franchisor, franchisees, various offices, and brokers, there are several database areas per franchisee.
Usually, this is where the first challenges for Propstack arise:
- The franchisor has projects/objects that it would like to make available to all franchisees. Typically, the projects/objects are then copied into the database of each franchisee.
- A franchisee makes projects/objects available to other franchisees. As a rule, the data is copied into the other databases here as well.
- In both cases, it is a challenge to have up-to-date reservation lists for the properties of all those who distribute the projects/objects.
- Customer databases are accessed via an API provided by Propstack. Because the API is inadequately set up, accessing the CRM system through it runs into massive performance limitations. For this reason, Propstack offers a data dump and extracts all data into it several times. This dump is the database that can be used for reporting purposes. It is accessed via an API that is set up identically to the API of the CRM system. As mentioned above, each customer has a separate database area, so each customer also gets an individual data dump. For a franchise company, this results in a large number of data dumps; in the case of Remax, there are 22 data dumps for Hungary alone. The use of Propstack is planned throughout Europe. As described above, copying project/object data produces many duplicates, which must be taken into account in a consolidated view.
- Due to the current structure, the entire transformation of the data must be carried out in the report each time, which is problematic for performance reasons.
- The Propstack data dump is usually complete and is an image of the CRM system. Propstack usually does not keep historical data; tasks and deals are the exception. No historical data can be retrieved for a project/object.
- The naming of the data collections and columns is inconsistent between the CRM, the public API (documented), and the data dump (minimal documentation). The CRM is in German, and the translations vary between the APIs and their documentation.
- The data in Propstack is dynamic. At any point, a customer can add a custom field in the CRM, which creates a new column in a data collection (table). By default, Power BI scans only the first 1000 rows for column profiling. This can be changed to use the entire data set, but that would slow down processing even more.
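Two of the points above, duplicates across per-franchisee dumps and columns that appear only in later rows, can be handled outside the report when the dumps are consolidated. The following is a minimal sketch under stated assumptions: it assumes that copies of the same project/object share a stable identifier (here the hypothetical column `source_object_id`) and carry a modification timestamp (`updated_at`, as an ISO date string). Neither column name is taken from the Propstack dump, and if copies do not share such a key, a different matching rule would be needed.

```python
def consolidate(dumps, key="source_object_id", updated="updated_at"):
    """Merge several dumps into one table.

    - Columns: union over every row, so late-appearing custom fields
      are not missed (no sampling of the first N rows).
    - Rows: one per key, keeping the most recently updated copy.
    """
    columns = {"franchisee"}
    newest = {}
    for franchisee, rows in dumps.items():
        for row in rows:
            columns.update(row)
            current = newest.get(row[key])
            # ISO date strings compare correctly as plain strings.
            if current is None or row[updated] > current[updated]:
                newest[row[key]] = {**row, "franchisee": franchisee}

    ordered = sorted(columns)
    # Fill columns a row does not have with None.
    table = [{c: r.get(c) for c in ordered} for r in newest.values()]
    return ordered, table


dumps = {
    "office_a": [
        {"source_object_id": "P-1", "updated_at": "2024-03-01", "status": "available"},
    ],
    "office_b": [
        {"source_object_id": "P-1", "updated_at": "2024-03-05", "status": "reserved",
         "custom_rating": 4},
        {"source_object_id": "P-2", "updated_at": "2024-03-02", "status": "available"},
    ],
}

columns, table = consolidate(dumps)
print(columns)
for row in table:
    print(row)
```

Doing this once during loading, rather than in every report refresh, is the kind of work the DataLake is meant to take over from the report layer.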
Below is a draft of the current architecture and its implications
Propstack
Current solution problems:
For every new customer, a new dump has to be created, which involves a cost
The number of API requests could increase as the number of customers increases
Licensing and Security (public link)
Centa MG
Current solution problems:
It is hard to refresh the data: every refresh retrieves all the data again
New queries have to be created for every new customer
Power BI – API
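The full-reload problem noted above is commonly addressed with a watermark-based incremental refresh: only records modified since the last successful load are fetched and merged. The sketch below illustrates the idea in plain Python. It assumes that the source can be filtered by a modification timestamp, which the notes above do not confirm for the Propstack API or dump; `fetch_changed`, `SOURCE`, and the column names are stand-ins invented for the example.

```python
def incremental_refresh(stored, fetch_changed, watermark,
                        key="id", updated="updated_at"):
    """Merge rows changed since `watermark` into `stored`.

    Returns the new watermark (the latest modification time seen).
    """
    for row in fetch_changed(watermark):
        stored[row[key]] = row  # insert new rows, overwrite changed ones
        if row[updated] > watermark:
            watermark = row[updated]
    return watermark


# Stand-in for the real data source (hypothetical data).
SOURCE = [
    {"id": 1, "updated_at": "2024-03-01", "status": "available"},
    {"id": 2, "updated_at": "2024-03-04", "status": "available"},
]


def fetch_changed(since):
    """Return only rows modified after `since` (ISO date strings)."""
    return [r for r in SOURCE if r["updated_at"] > since]


stored = {}
watermark = incremental_refresh(stored, fetch_changed, "2024-01-01")

# A later change in the source: only this row is fetched next time.
SOURCE.append({"id": 1, "updated_at": "2024-03-06", "status": "sold"})
watermark = incremental_refresh(stored, fetch_changed, watermark)

print(watermark, stored[1]["status"], len(stored))
```

Because the customer is just another parameter of the fetch function, the same routine could in principle be reused per customer instead of writing new queries each time.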