# Price Comparison PWA Solution Structure

This document outlines the solution structure for a price comparison PWA with a NestJS backend and a PostgreSQL database.

## System Architecture

```mermaid
graph TD
    A[Web Scrapers] -->|Extract Data| B[NestJS Backend]
    B -->|Store Data| C[PostgreSQL Database]
    B -->|Serve API| D[PWA Frontend]
    E[Users] -->|Use| D
```

## Backend Structure

### Database Schema

```mermaid
erDiagram
    PRODUCT {
        int id PK
        string name
        string description
        string category
        boolean availability
    }
    PRICE {
        int id PK
        int product_id FK
        float regular_price
        float discounted_price
        float discount_percentage
        string unit_price
        string promotion_type
        date promotion_start
        date promotion_end
        date last_updated
        int source_id FK
    }
    SOURCE {
        int id PK
        string name
        string url
        string logo
        datetime last_scraped
    }
    PRODUCT ||--o{ PRICE : has
    SOURCE ||--o{ PRICE : provides
```

### Additional Database Fields

We'll add these fields to handle the specific data format:

- **Product**: Add `sourceProductId` to track original product IDs
- **Price**: Add `vatIncluded` boolean flag since prices include VAT
- **Source**: Add `lastUpdateTime` to track the "Последно ажурирање" ("Last updated") timestamp

### Data Transformation Rules

1. **Text Processing**
   - Handle Cyrillic text encoding (UTF-8)
   - Parse product names and descriptions
   - Extract category from description field

2. **Price Processing**
   - Convert prices from string to float
   - Handle the "ден/кг" (denars per kilogram) unit price format
   - Store both VAT-included and VAT-excluded prices

3. **Date Processing**
   - Parse dates from "DD/MM/YYYY" format
   - Handle time in "HH:mm" format for last update
   - Store timestamps in UTC

### Scraper Implementation

The scraper will process the HTML table structure:

```typescript
interface RawProductData {
  productName: string;      // "Назив на стока"
  regularPrice: string;     // "Продажна цена (со ДДВ)"
  unitPrice: string;        // "Единечна цена"
  availability: string;     // "Достапност во продажен објект"
  description: string;      // "Опис на стока"
  discountPrice: string;    // "Цена со попуст"
  discountPercent: string;  // "Попуст (%)"
  promotionType: string;    // "Вид на продажно потикнување"
  promotionPeriod: string;  // "Времетраење на промоција или попуст"
}

interface ProcessedProduct {
  name: string;
  description: string;
  category: string; // Extracted from description
  availability: boolean;
  prices: {
    regular: number;
    discounted: number | null;
    unit: {
      price: number;
      measurement: string; // "ден/кг", etc.
    };
  };
  promotion: {
    type: string;
    discountPercentage: number;
    startDate: Date;
    endDate: Date;
  } | null;
}
```

### HTML Parsing Strategy

1. **Table Structure**

   ```typescript
   const parseTable = async (html: string): Promise<RawProductData[]> => {
     // Use cheerio or similar for HTML parsing
     // Target structure: table > tr > td
     // Skip header row (first row)
     // Handle Cyrillic encoding
   }
   ```

2. **Data Extraction**

   ```typescript
   const extractProduct = (row: CheerioElement): RawProductData => {
     // Extract td contents
     // Clean and normalize text
     // Handle special characters
   }
   ```

3. **Data Transformation**

   ```typescript
   const transformProduct = (raw: RawProductData): ProcessedProduct => {
     // Convert prices to numbers
     // Parse dates
     // Extract category
     // Convert availability to boolean
   }
   ```

### NestJS Modules

1. **Scraper Module**
   - Service for each data source
   - HTML parsing utilities
   - Scheduling for regular updates
   - Error handling and retry logic

2. **Product Module**
   - Product entity and repository
   - CRUD operations
   - Search and filtering

3. **Price Module**
   - Price entity and repository
   - Price history tracking
   - Discount calculations

4. **Source Module**
   - Source entity and repository
   - Source metadata management

5. **API Module**
   - RESTful endpoints
   - GraphQL API (optional)
   - Authentication and rate limiting

## Frontend Structure (PWA)

1. **Core Components**
   - Product listing
   - Product details
   - Price comparison
   - Search and filters
   - Favorites/Watchlist

2. **PWA Features**
   - Offline support
   - Push notifications for price drops
   - App installation
   - Responsive design

## Implementation Plan

### Phase 1: Backend Setup

1. Initialize NestJS project
2. Set up PostgreSQL connection
3. Define database entities
4. Create basic API endpoints

### Phase 2: Scraper Implementation

1. Create scraper services for each source
2. Implement HTML parsing based on the provided structure
3. Set up scheduled scraping jobs
4. Implement data normalization and storage

### Phase 3: Frontend Development

1. Set up PWA framework
2. Implement core UI components
3. Connect to backend API
4. Implement offline functionality

### Phase 4: Testing & Deployment

1. Unit and integration testing
2. Performance optimization
3. Deployment setup
4. Monitoring and analytics

## Scraper Implementation Details

Based on the HTML structure provided, here's how we'll parse the data:

```typescript
interface ProductData {
  name: string;
  regularPrice: number;
  unitPrice: string;
  availability: boolean;
  description: string;
  discountedPrice: number | null;
  discountPercentage: number | null;
  promotionType: string | null;
  promotionPeriod: {
    start: Date | null;
    end: Date | null;
  };
  lastUpdated: Date;
  source: string;
}
```

The scraper will:

1. Fetch the HTML content
2. Parse the table structure
3. Extract data from each row
4. Transform dates and numeric values
5. Store normalized data in the database

## Data Extraction Process

The HTML structure contains product information in a table format. Each row represents a product with the following columns:

- Product name
- Regular price (with VAT)
- Unit price
- Availability
- Product description
- Regular price (repeated)
- Discounted price
- Discount percentage
- Type of promotion
- Promotion duration

The scraper will need to handle:

- Text encoding (the content is in Cyrillic script, handled as UTF-8)
- Date parsing (format: DD/MM/YYYY)
- Price conversion to numeric values
- Availability conversion to boolean
- Extracting promotion date ranges

## API Endpoints

The backend will provide the following key API endpoints:

1. **Products**
   - `GET /products` - List all products with pagination
   - `GET /products/:id` - Get product details
   - `GET /products/search` - Search products by name/category

2. **Prices**
   - `GET /prices/product/:id` - Get all prices for a product
   - `GET /prices/compare/:ids` - Compare prices for multiple products
   - `GET /prices/history/:id` - Get price history for a product

3. **Sources**
   - `GET /sources` - List all data sources
   - `GET /sources/:id/products` - Get products from a specific source

4. **User Features**
   - `POST /watchlist` - Add product to watchlist
   - `GET /watchlist` - Get user's watchlist
   - `POST /notifications` - Configure price drop notifications
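
As an illustration of what could sit behind `GET /prices/compare/:ids`, the sketch below parses the `:ids` path parameter and ranks the offers for each requested product. It is a minimal sketch under assumptions that this document does not specify: `:ids` is taken to be a comma-separated list of numeric product IDs, and the names `PriceRow`, `Comparison`, `parseIdsParam`, `effectivePrice` and `comparePrices` are hypothetical. The rows would normally come from the Price repository; here they are passed in as plain objects.

```typescript
// Hypothetical shape of one price row, loosely mirroring the PRICE table.
interface PriceRow {
  productId: number;
  sourceId: number;
  regularPrice: number;
  discountedPrice: number | null;
}

// Hypothetical response item: all offers for one product, cheapest first.
interface Comparison {
  productId: number;
  cheapest: PriceRow | null;
  offers: PriceRow[];
}

// Assumption: ":ids" looks like "1,2,3". Non-numeric parts are dropped
// and duplicates are removed while keeping first-seen order.
function parseIdsParam(ids: string): number[] {
  const parsed = ids
    .split(",")
    .map((part) => part.trim())
    .filter((part) => /^\d+$/.test(part))
    .map(Number);
  return Array.from(new Set(parsed));
}

// The price a shopper would pay: the discounted price when present.
function effectivePrice(row: PriceRow): number {
  return row.discountedPrice ?? row.regularPrice;
}

// Group rows by requested product and sort each group by effective price.
function comparePrices(ids: number[], rows: PriceRow[]): Comparison[] {
  return ids.map((productId) => {
    const offers = rows
      .filter((row) => row.productId === productId)
      .sort((a, b) => effectivePrice(a) - effectivePrice(b));
    return { productId, cheapest: offers[0] ?? null, offers };
  });
}
```

Returning an entry with `cheapest: null` for a product that has no price rows keeps the response aligned with the requested IDs; whether the real endpoint should do that or respond with an error instead is a design choice left open here.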
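
The conversions listed under "Data Extraction Process" (prices, DD/MM/YYYY dates, promotion date ranges) can be sketched as small pure functions. This is a sketch under stated assumptions, not the final scraper code: all function names are hypothetical, prices are assumed to use a comma as the decimal separator whenever a comma is present (for example "1.299,00"), unit prices are assumed to look like "45,00 ден/кг", and the promotion period is assumed to contain two DD/MM/YYYY dates with any separator between them. Calendar dates are stored as midnight UTC, which is one possible reading of the "store timestamps in UTC" rule.

```typescript
// Assumption: if a comma is present it is the decimal separator and dots
// are thousands separators; otherwise the string is parsed as-is.
function parsePrice(raw: string): number | null {
  const cleaned = raw.replace(/[^\d.,]/g, ""); // drop currency text and spaces
  if (cleaned === "") return null;
  const normalized = cleaned.includes(",")
    ? cleaned.replace(/\./g, "").replace(",", ".")
    : cleaned;
  const value = Number(normalized);
  return Number.isFinite(value) ? value : null;
}

// Assumption: unit prices look like "<number> <measurement>".
function parseUnitPrice(raw: string): { price: number; measurement: string } | null {
  const match = /^([\d.,]+)\s*(\S.*)$/.exec(raw.trim());
  if (!match) return null;
  const price = parsePrice(match[1]);
  return price === null ? null : { price, measurement: match[2].trim() };
}

// Parse "DD/MM/YYYY" into a Date at midnight UTC; invalid dates yield null.
function parseDate(raw: string): Date | null {
  const match = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(raw.trim());
  if (!match) return null;
  const day = Number(match[1]);
  const month = Number(match[2]);
  const year = Number(match[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC silently rolls over values like 31/02, so verify the round trip.
  const valid = date.getUTCDate() === day && date.getUTCMonth() === month - 1;
  return valid ? date : null;
}

// Take the first two DD/MM/YYYY dates found in the text as start and end.
function parsePromotionPeriod(raw: string): { start: Date | null; end: Date | null } {
  const found = raw.match(/\d{2}\/\d{2}\/\d{4}/g) ?? [];
  return {
    start: found[0] ? parseDate(found[0]) : null,
    end: found[1] ? parseDate(found[1]) : null,
  };
}
```

Returning `null` instead of throwing lets the scraper store a row with a missing optional field (for example no promotion period) and log the raw value for later inspection.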