Unlocking Product Knowledge: How NLP Can Empower ChatGPT in Retail

Source: ExpertGuru Articles

The Why?

In retail, staying ahead means adopting technology that improves the customer experience and streamlines operations. Natural Language Processing (NLP) is central to this shift, especially when combined with conversational AI such as ChatGPT. At ExpertGuru, we have used ChatGPT to build an approach to AI-driven personalization in the digital marketplace.

What is the Problem?

Product recommendation engines generally have to handle the following types of user queries.

  1. Product Specifications: Queries about the features, materials, sizes, or colors available for a product.
  2. Product Availability: Questions regarding the stock status of items, including sizes or colors.
  3. Usage-Based Recommendations: Inquiries about the best products for specific needs, occasions, or based on user preferences.

We are designing ExpertGuru to handle exactly these three query types, using ChatGPT for contextual understanding to improve the user experience. Consider the following example, in which the user is planning a hiking trip:

User: I need a backpack for a three-day hiking trip. Something durable and light. Any ideas?

ExpertGuru: For a three-day trip carrying about 20 lbs, I recommend the "TrailMaster 45L Hiking Backpack." It's water-resistant, weighs 2.5 lbs, and has a comfortable fit, perfect for your needs. It’s in stock in blue and green. Interested?

User: Yes, does it come with a rain cover?

ExpertGuru: Yes, it includes an integrated rain cover, multiple compartments for organization, and external attachment points for gear. It's designed for durability and comfort on multi-day hikes.

Before making a recommendation like this, an intelligent system needs to work through the following steps:

  • Contextual Understanding: The system understands the user's needs: a durable, lightweight backpack suitable for a three-day hike.
  • Product Specifications Query: The system searches the product database for backpacks that match the specifications: durable materials and low weight.
  • Product Availability: The system checks the stock status for backpacks that fit the criteria.
  • Usage-based Recommendations: Considering the length of the hike and the user's preference for a lightweight option, the system also factors in ergonomics and user reviews for comfort during extended use.
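The steps above can be sketched as a small filter-and-rank pipeline. This is a minimal illustration, not ExpertGuru's implementation: it assumes the contextual-understanding step has already turned the user's message into required tags and a weight limit, and it uses a hypothetical in-memory catalog. Apart from the TrailMaster details quoted in the dialogue, the products, tags, and ratings are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Product:
    title: str
    weight_lbs: float
    tags: set[str]
    in_stock_colors: list[str]
    rating: float  # hypothetical average review score


# Hypothetical catalog; a real system would query a search index instead.
CATALOG = [
    Product("TrailMaster 45L Hiking Backpack", 2.5,
            {"durable", "hiking", "water-resistant"}, ["blue", "green"], 4.6),
    Product("CityLite 20L Daypack", 1.2, {"commuter"}, ["black"], 4.2),
    Product("SummitPro 60L Expedition Pack", 5.1, {"durable", "hiking"}, [], 4.8),
]


def recommend(required_tags, max_weight_lbs, catalog=CATALOG):
    """Filter by specification, then availability, then rank by rating."""
    # Product specifications query: every required tag present, weight within limit.
    matches = [p for p in catalog
               if required_tags <= p.tags and p.weight_lbs <= max_weight_lbs]
    # Product availability: keep only products with at least one color in stock.
    available = [p for p in matches if p.in_stock_colors]
    # Usage-based recommendation: best-reviewed products first.
    return sorted(available, key=lambda p: p.rating, reverse=True)


picks = recommend({"durable", "hiking"}, max_weight_lbs=3.0)
print([p.title for p in picks])
```

Keeping each step as a separate filter makes it straightforward to swap any one of them for a real service, such as a search index or an inventory lookup.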

Implementing Product Recommendations

Integrating Shopify Data with Elasticsearch

To enable effective product recommendations, our first task was to bring product data from the Shopify platform into our own system. We designed this as a two-step operation: first, extract comprehensive product information from Shopify into our environment; second, index it in Elasticsearch, a search and analytics engine that we chose as our data store. By indexing product details such as titles and descriptions in Elasticsearch, we laid the foundation our recommendation engines need to access and analyze Shopify product data efficiently.

The Challenges

In Shopify, product information is accessible through the Admin API, which is available over both GraphQL and REST endpoints. The data is returned as JSON and covers product descriptions, inventory levels, and variant information. Product descriptions deserve particular attention: shop owners enter them through the web interface, and they are stored as HTML. Extracting meaningful information therefore requires parsing a mix of text and HTML markup. The API also exposes other useful metadata, including product images, inventory details, and the options available for product variants.

While building product recommendations for ExpertGuru, we encountered several challenges, which fall into three main areas:

  1. Structuring Product Data: This pertains to obtaining structured product information for use in recommendations, such as product descriptions. This information varied greatly in format across different websites. It was sometimes accessible directly through Shopify APIs as HTML files, which required us to parse HTML tables. At other times, it was presented in image format or as a mixture of text and tabular content.
  2. Embedding Logic for Recommendations: We used several retrieval methods to build a list of candidate products for each user query. Matching products to user interests effectively required combining these techniques.
  3. Handling Diverse Inventory: This involves managing recommendations for a diverse inventory with varying levels of product information, popularity, and availability. This includes strategies for recommending new or less-known products alongside bestsellers.

How did we bring Shopify merchant data to Elasticsearch?

Source Data Integration

To use Elasticsearch as a vector database for product recommendations, we first built a data integration process. We developed a pipeline that systematically fetches product data from Shopify stores and stores it in our SQL database. Through Shopify's API, we could access detailed product information: not only basic specifications such as features, materials, sizes, and colors, but also inventory status. This ensured that our dataset was both rich and reflective of the current offerings on Shopify.
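As a rough illustration of the storage half of such a pipeline, the sketch below upserts fetched product records into a SQL table. It is an assumption-laden example rather than our production code: it uses SQLite from the Python standard library as a stand-in for the SQL database, the table layout and the function names `init_store` and `upsert_products` are hypothetical, and the fetch from Shopify is not shown.

```python
import json
import sqlite3


def init_store(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) staging table for raw product payloads."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS products (
               product_id TEXT PRIMARY KEY,
               shop       TEXT NOT NULL,
               payload    TEXT NOT NULL,
               fetched_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )


def upsert_products(conn: sqlite3.Connection, shop: str, products: list[dict]) -> int:
    """Insert new products and overwrite existing ones, keyed by product id."""
    rows = [(str(p["id"]), shop, json.dumps(p)) for p in products]
    with conn:  # commit on success, roll back on error
        conn.executemany(
            """INSERT INTO products (product_id, shop, payload)
               VALUES (?, ?, ?)
               ON CONFLICT(product_id) DO UPDATE SET
                   shop = excluded.shop,
                   payload = excluded.payload,
                   fetched_at = CURRENT_TIMESTAMP""",
            rows,
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
init_store(conn)
upsert_products(conn, "demo-shop", [{"id": 1, "title": "TrailMaster 45L Hiking Backpack"}])
```

Storing the raw JSON payload alongside the key keeps re-fetches idempotent and leaves normalization to a later step.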

Data Modification

To keep product information consistent and improve recommendations, we introduced a normalization step that transforms the varied data formats into a standardized schema. This unified schema gives product attributes a single representation, which enables efficient querying and analysis. It also serves as the foundation for subsequent indexing and search operations within Elasticsearch, so that data inconsistency does not hamper the performance of our product recommendation system.
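A normalization step of this kind might look like the following sketch. The target schema (`NormalizedProduct`) is hypothetical, and the input field names (`id`, `title`, `body_html`, `options`, `variants`, `inventory_quantity`) are modeled on the product payload of Shopify's REST Admin API; treat them as assumptions and check them against the API version in use.

```python
from dataclasses import dataclass, field


@dataclass
class NormalizedProduct:
    """Hypothetical unified schema used before indexing."""
    product_id: str
    title: str
    description_html: str
    options: dict[str, list[str]] = field(default_factory=dict)
    total_inventory: int = 0


def normalize_product(raw: dict) -> NormalizedProduct:
    """Map one raw product payload onto the unified schema."""
    # Lower-case option names so "Color" and "color" end up in the same key.
    options = {
        opt["name"].strip().lower(): [value.strip() for value in opt.get("values", [])]
        for opt in raw.get("options", [])
    }
    # Missing or null quantities count as zero.
    total = sum(int(v.get("inventory_quantity") or 0) for v in raw.get("variants", []))
    return NormalizedProduct(
        product_id=str(raw["id"]),
        title=(raw.get("title") or "").strip(),
        description_html=raw.get("body_html") or "",
        options=options,
        total_inventory=total,
    )
```

Normalizing identifiers to strings and option names to lower case up front avoids having to handle those variations in every later query.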

Data Modification Techniques for Elasticsearch Indexing

In the process of indexing product information in Elasticsearch, several critical data modification techniques were employed to ensure the efficacy and accuracy of search results. These modifications were pivotal in transforming raw product data into a searchable, insightful format, enhancing the capabilities of our product recommendation system.

1. Cleaning Input Product Data

The first step involved extracting textual content from the HTML body objects where product descriptions were embedded. Given that these descriptions were stored in HTML format by Shopify store owners, it was imperative to parse this HTML content accurately to retrieve clean, plain text. This parsing not only involved stripping away HTML tags but also ensuring that the textual integrity and formatting cues important for understanding the product were preserved. This meticulous extraction process was essential for subsequent processing and embedding stages.
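One way to do this kind of extraction with only the Python standard library is sketched below. It is an illustrative example, not the parser we run in production: the choice of which tags count as block-level and the helper name `html_to_text` are assumptions. Block-level tags become line breaks so that list items and table rows stay on separate lines, and the contents of `script` and `style` elements are dropped.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    # Tags treated as line boundaries (an assumption; extend as needed).
    BLOCK_TAGS = {"p", "br", "li", "tr", "div", "ul", "ol", "table",
                  "h1", "h2", "h3", "h4", "h5", "h6"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")
        elif tag in ("td", "th"):
            self.parts.append(" ")  # keep table cells apart

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Strip tags, keep line structure, and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


print(html_to_text("<p>Durable &amp; light.</p><ul><li>Weight: 2.5 lbs</li></ul>"))
```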

2. Chunking of Product Description

To maintain the contextual integrity of product descriptions during indexing, we implemented a chunking strategy. The strategy segments each description into coherent, self-contained blocks of text, avoiding the loss of context that occurs when descriptions are truncated or split arbitrarily. Keeping chunks contextually rich ensures that the semantic embeddings generated later capture the full meaning of the description, which improves the relevance and accuracy of search results. To produce chunks that each give a complete picture of the product, we used ChatGPT prompts: for longer inputs, we instructed ChatGPT to create a summary of the product description.
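For illustration, a simple way to avoid arbitrary truncation is to split only at sentence boundaries and pack whole sentences into chunks up to a size budget. The sketch below shows that idea; it is a swapped-in, rule-based technique and not the ChatGPT summarization prompt described above. The function name `chunk_description` and the 500-character default are assumptions.

```python
import re


def chunk_description(text: str, max_chars: int = 500) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept intact as its own chunk
    rather than being cut mid-sentence.
    """
    # Split after ., ! or ? followed by whitespace; "2.5 lbs" is not split
    # because no whitespace follows the dot.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)  # budget exceeded: close the current chunk
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = "Light and durable. Fits 45 liters. Includes a rain cover."
print(chunk_description(text, max_chars=40))
```

Descriptions that still exceed the budget after this step are the natural candidates for the summarization prompt.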

3. Addition of Meta Information: Product Variant Information

Some queries call for a recommendation broken down by variant, with details such as size, colour, and material alongside availability. To handle these, we added product variant information as a meta field in Elasticsearch.
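A possible shape for such a meta field is sketched below as a Python dictionary holding an Elasticsearch index mapping, plus a helper that builds the per-variant entries. The field names (`variants`, `size`, `color`, `material`, `in_stock`) and the input keys are assumptions made for the example, not our actual index layout. The `nested` type is used so that the size, color, and stock status of one variant are matched together rather than mixed across variants.

```python
# Hypothetical index mapping; only the field types are standard Elasticsearch.
VARIANT_MAPPING = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "description_chunk": {"type": "text"},
            "variants": {
                "type": "nested",
                "properties": {
                    "size": {"type": "keyword"},
                    "color": {"type": "keyword"},
                    "material": {"type": "keyword"},
                    "in_stock": {"type": "boolean"},
                },
            },
        }
    }
}


def _norm(value):
    """Lower-case and trim strings so keyword matches are case-insensitive."""
    if isinstance(value, str) and value.strip():
        return value.strip().lower()
    return None


def build_variant_meta(variants: list[dict]) -> list[dict]:
    """Turn raw variant records into entries for the 'variants' meta field."""
    return [
        {
            "size": _norm(v.get("size")),
            "color": _norm(v.get("color")),
            "material": _norm(v.get("material")),
            "in_stock": int(v.get("inventory_quantity") or 0) > 0,
        }
        for v in variants
    ]
```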

4. Enabling Hybrid-Based Search

To further refine the search capabilities of our system, we incorporated additional information that facilitated a hybrid-based search approach. This approach combined traditional keyword search with semantic search capabilities. By indexing product information in a way that supports both exact match queries and semantic relevance, our system can interpret and fulfill a wider range of user intents. Whether a user inputs specific keywords or phrases that hint at desired product features, the hybrid search mechanism is designed to return the most relevant results by leveraging both keyword and semantic matching techniques.

We defined a hybrid search strategy in Elasticsearch that combines keyword matching with approximate similarity search: traditional text queries run against the variant fields, and similarity searches run against the vector fields.
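To make the combination concrete, the sketch below builds a search request body that pairs a keyword query with a `knn` clause, in the style of the Elasticsearch 8.x search API. It only constructs the dictionary and does not call a cluster. The field names (`title`, `description_chunk`, `embedding`, `variants.color`, `variants.in_stock`) and the function name `build_hybrid_query` are assumptions, and the query vector would come from whichever embedding model is used.

```python
def build_hybrid_query(text, query_vector, color=None, k=10):
    """Build a hybrid (keyword + vector) search body for in-stock products."""
    # Exact-match conditions on variant fields, evaluated per nested variant.
    variant_terms = [{"term": {"variants.in_stock": True}}]
    if color:
        variant_terms.append({"term": {"variants.color": color.lower()}})
    variant_filter = {
        "nested": {
            "path": "variants",
            "query": {"bool": {"filter": variant_terms}},
        }
    }
    return {
        "size": k,
        # Keyword leg: full-text match on title and description.
        "query": {
            "bool": {
                "must": [{
                    "multi_match": {
                        "query": text,
                        "fields": ["title", "description_chunk"],
                    }
                }],
                "filter": [variant_filter],
            }
        },
        # Semantic leg: approximate nearest-neighbour search on the vector field.
        "knn": {
            "field": "embedding",
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": max(50, 5 * k),
            "filter": variant_filter,
        },
    }


body = build_hybrid_query("light hiking backpack", [0.1, 0.2, 0.3], color="Blue", k=5)
```

The variant filter is repeated inside the `knn` clause because a filter placed only on the keyword query does not restrict the vector results.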


These data modification techniques collectively enhance the search and recommendation capabilities of our Elasticsearch-based system. By carefully parsing, chunking, enriching with meta information, and enabling hybrid search, we've created a powerful tool that significantly improves the e-commerce shopping experience, making it more intuitive, accurate, and user-friendly.
