Price Comparison PWA Solution Structure
This document outlines the solution structure for a price comparison PWA with a NestJS backend and a PostgreSQL database.
System Architecture
graph TD
A[Web Scrapers] -->|Extract Data| B[NestJS Backend]
B -->|Store Data| C[PostgreSQL Database]
B -->|Serve API| D[PWA Frontend]
E[Users] -->|Use| D
Backend Structure
Database Schema
erDiagram
PRODUCT {
int id PK
string name
string description
string category
boolean availability
}
PRICE {
int id PK
int product_id FK
float regular_price
float discounted_price
float discount_percentage
string unit_price
string promotion_type
date promotion_start
date promotion_end
date last_updated
int source_id FK
}
SOURCE {
int id PK
string name
string url
string logo
datetime last_scraped
}
PRODUCT ||--o{ PRICE : has
SOURCE ||--o{ PRICE : provides
Additional Database Fields
We'll add these fields to handle the specific data format:
- Product: Add sourceProductId to track original product IDs
- Price: Add vatIncluded boolean flag since prices include VAT
- Source: Add lastUpdateTime to track the "Последно ажурирање" ("Last updated") timestamp
Data Transformation Rules
- Text Processing
  - Handle Cyrillic text encoding (UTF-8)
  - Parse product names and descriptions
  - Extract category from description field
- Price Processing
  - Convert prices from string to float
  - Handle "ден/кг" unit price format
  - Store both VAT-included and VAT-excluded prices
- Date Processing
  - Parse dates from "DD/MM/YYYY" format
  - Handle time in "HH:mm" format for last update
  - Store timestamps in UTC
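The price rules above could be sketched roughly as follows. This is a minimal illustration, not the project's actual parser: the function names (parsePrice, parseUnitPrice, priceExcludingVat) are hypothetical, the input formats ("1.299,00 ден", "120,50 ден/кг") are assumed rather than confirmed from the source pages, and the VAT rate is passed in as a parameter because the source does not state one.

```typescript
interface UnitPrice {
  price: number;
  measurement: string; // e.g. "ден/кг"
}

// Assumption: the right-most "." or "," is the decimal mark, so "1.299,00"
// parses as 1299. A bare "1.299" is ambiguous and is read here as 1.299.
function parsePrice(raw: string): number | null {
  // Keep only digits and separators; drop currency text and whitespace.
  const cleaned = raw.replace(/[^\d.,]/g, "");
  if (cleaned === "") return null;

  const decimalIndex = Math.max(cleaned.lastIndexOf(","), cleaned.lastIndexOf("."));
  let normalized = cleaned;
  if (decimalIndex !== -1) {
    const integerPart = cleaned.slice(0, decimalIndex).replace(/[.,]/g, "");
    const fractionPart = cleaned.slice(decimalIndex + 1);
    normalized = `${integerPart}.${fractionPart}`;
  }
  const value = Number(normalized);
  return Number.isFinite(value) ? value : null;
}

// Splits a unit price such as "120,50 ден/кг" into amount and measurement.
function parseUnitPrice(raw: string): UnitPrice | null {
  const match = raw.trim().match(/^([\d.,\s]+)\s*(.*)$/);
  if (!match) return null;
  const price = parsePrice(match[1] ?? "");
  if (price === null) return null;
  return { price, measurement: (match[2] ?? "").trim() };
}

// Derives the VAT-excluded price from a VAT-included one, rounded to 2 decimals.
// vatRate is a fraction (0.18 means 18%); the actual rate is an assumption
// the caller must supply.
function priceExcludingVat(grossPrice: number, vatRate: number): number {
  return Math.round((grossPrice / (1 + vatRate)) * 100) / 100;
}
```

Returning null for unparseable input, rather than 0 or NaN, lets the storage step decide whether to skip the row or log it.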
Scraper Implementation
The scraper will process the HTML table structure:
interface RawProductData {
productName: string; // "Назив на стока"
regularPrice: string; // "Продажна цена (со ДДВ)"
unitPrice: string; // "Единечна цена"
availability: string; // "Достапност во продажен објект"
description: string; // "Опис на стока"
discountPrice: string; // "Цена со попуст"
discountPercent: string; // "Попуст (%)"
promotionType: string; // "Вид на продажно потикнување"
promotionPeriod: string; // "Времетраење на промоција или попуст"
}
interface ProcessedProduct {
name: string;
description: string;
category: string; // Extracted from description
availability: boolean;
prices: {
regular: number;
discounted: number | null;
unit: {
price: number;
measurement: string; // "ден/кг", etc.
};
};
promotion: {
type: string;
discountPercentage: number;
startDate: Date;
endDate: Date;
} | null;
}
HTML Parsing Strategy
- Table Structure
const parseTable = async (html: string): Promise<RawProductData[]> => {
  // Use cheerio or similar for HTML parsing
  // Target structure: table > tr > td
  // Skip header row (first row)
  // Handle Cyrillic encoding
}
- Data Extraction
const extractProduct = (row: CheerioElement): RawProductData => {
  // Extract td contents
  // Clean and normalize text
  // Handle special characters
}
- Data Transformation
const transformProduct = (raw: RawProductData): ProcessedProduct => {
  // Convert prices to numbers
  // Parse dates
  // Extract category
  // Convert availability to boolean
}
NestJS Modules
- Scraper Module
  - Service for each data source
  - HTML parsing utilities
  - Scheduling for regular updates
  - Error handling and retry logic
- Product Module
  - Product entity and repository
  - CRUD operations
  - Search and filtering
- Price Module
  - Price entity and repository
  - Price history tracking
  - Discount calculations
- Source Module
  - Source entity and repository
  - Source metadata management
- API Module
  - RESTful endpoints
  - GraphQL API (optional)
  - Authentication and rate limiting
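The retry logic mentioned for the Scraper Module might look something like the sketch below. It uses exponential backoff, which is one common choice and not something this document prescribes. The names withRetry, backoffDelay, and RetryOptions are hypothetical, and the helper is framework-agnostic so it could be called from any NestJS service.

```typescript
interface RetryOptions {
  retries: number;     // retries after the first attempt
  baseDelayMs: number; // delay before the first retry
}

// Delay doubles with each attempt: base, 2*base, 4*base, ...
function backoffDelay(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** attempt;
}

// Runs the task, retrying on any thrown error until retries are exhausted.
// The last error is rethrown so the caller can log or report it.
async function withRetry<T>(
  task: () => Promise<T>,
  { retries, baseDelayMs }: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        const delay = backoffDelay(attempt, baseDelayMs);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

A scraper service could wrap its fetch call as withRetry(() => fetchPage(url), { retries: 3, baseDelayMs: 500 }), where fetchPage stands in for whatever HTTP call the service uses.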
Frontend Structure (PWA)
- Core Components
  - Product listing
  - Product details
  - Price comparison
  - Search and filters
  - Favorites/Watchlist
- PWA Features
  - Offline support
  - Push notifications for price drops
  - App installation
  - Responsive design
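Price-drop notifications need some rule for deciding when a drop is worth reporting. One possible approach, sketched under assumptions: compare the latest scrape against the previous one per product and source, and report drops above a threshold. The names (PriceSnapshot, PriceDrop, detectPriceDrops) and the 5% default threshold are hypothetical, and the actual push delivery is out of scope here.

```typescript
interface PriceSnapshot {
  productId: number;
  sourceId: number;
  price: number;
}

interface PriceDrop {
  productId: number;
  sourceId: number;
  oldPrice: number;
  newPrice: number;
  dropPercent: number;
}

// Returns every product/source pair whose price fell by at least
// minDropPercent between the previous and the current scrape.
function detectPriceDrops(
  previous: PriceSnapshot[],
  current: PriceSnapshot[],
  minDropPercent = 5,
): PriceDrop[] {
  const key = (s: PriceSnapshot): string => `${s.productId}:${s.sourceId}`;
  const previousByKey = new Map<string, number>();
  for (const snapshot of previous) {
    previousByKey.set(key(snapshot), snapshot.price);
  }

  const drops: PriceDrop[] = [];
  for (const snapshot of current) {
    const oldPrice = previousByKey.get(key(snapshot));
    // Skip products seen for the first time and invalid old prices.
    if (oldPrice === undefined || oldPrice <= 0) continue;
    const dropPercent = ((oldPrice - snapshot.price) / oldPrice) * 100;
    if (dropPercent >= minDropPercent) {
      drops.push({
        productId: snapshot.productId,
        sourceId: snapshot.sourceId,
        oldPrice,
        newPrice: snapshot.price,
        dropPercent,
      });
    }
  }
  return drops;
}
```

The resulting list could then be intersected with each user's watchlist before any notification is sent.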
Implementation Plan
Phase 1: Backend Setup
- Initialize NestJS project
- Set up PostgreSQL connection
- Define database entities
- Create basic API endpoints
Phase 2: Scraper Implementation
- Create scraper services for each source
- Implement HTML parsing based on the provided structure
- Set up scheduled scraping jobs
- Implement data normalization and storage
Phase 3: Frontend Development
- Set up PWA framework
- Implement core UI components
- Connect to backend API
- Implement offline functionality
Phase 4: Testing & Deployment
- Unit and integration testing
- Performance optimization
- Deployment setup
- Monitoring and analytics
Scraper Implementation Details
Based on the HTML structure provided, here's how we'll parse the data:
interface ProductData {
name: string;
regularPrice: number;
unitPrice: string;
availability: boolean;
description: string;
discountedPrice: number | null;
discountPercentage: number | null;
promotionType: string | null;
promotionPeriod: {
start: Date | null;
end: Date | null;
};
lastUpdated: Date;
source: string;
}
The scraper will:
- Fetch the HTML content
- Parse the table structure
- Extract data from each row
- Transform dates and numeric values
- Store normalized data in the database
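The five steps above can be wired together as a small pipeline. The sketch below is illustrative only: ScrapeSteps, ScrapeResult, and runScrape are hypothetical names, and each step is injected as a function so the same flow works for any source. It also assumes that a row which fails to transform should be counted and skipped rather than abort the whole run, which is a design choice the document does not specify.

```typescript
interface ScrapeSteps<Raw, Clean> {
  fetchHtml: () => Promise<string>;       // step 1: fetch the HTML content
  parseRows: (html: string) => Raw[];     // steps 2-3: parse table, extract rows
  transform: (raw: Raw) => Clean;         // step 4: convert dates and numbers
  store: (items: Clean[]) => Promise<void>; // step 5: persist normalized data
}

interface ScrapeResult {
  stored: number;
  failed: number;
}

async function runScrape<Raw, Clean>(
  steps: ScrapeSteps<Raw, Clean>,
): Promise<ScrapeResult> {
  const html = await steps.fetchHtml();
  const rawRows = steps.parseRows(html);

  const clean: Clean[] = [];
  let failed = 0;
  for (const raw of rawRows) {
    try {
      clean.push(steps.transform(raw));
    } catch {
      // One malformed row should not discard the rest of the table.
      failed += 1;
    }
  }

  await steps.store(clean);
  return { stored: clean.length, failed };
}
```

Keeping the steps as injected functions also makes the flow easy to test with in-memory fakes instead of live HTTP and database calls.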
Data Extraction Process
The HTML structure contains product information in a table format. Each row represents a product with the following columns:
- Product name
- Regular price (with VAT)
- Unit price
- Availability
- Product description
- Regular price (repeated)
- Discounted price
- Discount percentage
- Type of promotion
- Promotion duration
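Assuming the ten columns appear in exactly the order listed above, a row's cell texts can be mapped onto RawProductData by position. This sketch takes an array of already-extracted cell strings, so it stays independent of the HTML parser; rowToRawProduct and EXPECTED_COLUMNS are hypothetical names, and the repeated regular price in the sixth column is ignored.

```typescript
interface RawProductData {
  productName: string;
  regularPrice: string;
  unitPrice: string;
  availability: string;
  description: string;
  discountPrice: string;
  discountPercent: string;
  promotionType: string;
  promotionPeriod: string;
}

const EXPECTED_COLUMNS = 10;

// Maps one table row (cell texts in column order) to RawProductData.
// Returns null for rows with too few cells, such as spacer or malformed rows.
function rowToRawProduct(cells: string[]): RawProductData | null {
  if (cells.length < EXPECTED_COLUMNS) return null;

  // Collapse runs of whitespace (including non-breaking spaces) and trim.
  const cell = (index: number): string =>
    (cells[index] ?? "").replace(/\s+/g, " ").trim();

  return {
    productName: cell(0),
    regularPrice: cell(1),
    unitPrice: cell(2),
    availability: cell(3),
    description: cell(4),
    // Column 5 repeats the regular price and is skipped.
    discountPrice: cell(6),
    discountPercent: cell(7),
    promotionType: cell(8),
    promotionPeriod: cell(9),
  };
}
```

If a source uses a different column order, only the index mapping needs to change.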
The scraper will need to handle:
- Cyrillic text (decoded as UTF-8)
- Date parsing (format: DD/MM/YYYY)
- Price conversion to numeric values
- Availability conversion to boolean
- Extracting promotion date ranges
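The date, availability, and promotion-range items above could be handled along these lines. Several details are assumptions not confirmed by the source: calendar dates are stored as UTC midnight with no local-time offset applied, the promotion period contains two DD/MM/YYYY dates in start-then-end order, and availability is marked with "Да" (yes). All function names are hypothetical.

```typescript
// Parses "DD/MM/YYYY" into a Date at UTC midnight, or null if invalid.
function parseDmyDate(raw: string): Date | null {
  const match = raw.trim().match(/^(\d{1,2})\/(\d{1,2})\/(\d{4})$/);
  if (!match) return null;
  const day = Number(match[1]);
  const month = Number(match[2]);
  const year = Number(match[3]);

  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC rolls over impossible dates (31/02 becomes 3 March),
  // so confirm the parts round-trip unchanged.
  const valid =
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day;
  return valid ? date : null;
}

// Extracts the first two dates found in the promotion period text.
// Assumption: the first date is the start and the second is the end.
function parsePromotionPeriod(raw: string): { start: Date | null; end: Date | null } {
  const [first, second] = raw.match(/\d{1,2}\/\d{1,2}\/\d{4}/g) ?? [];
  return {
    start: first ? parseDmyDate(first) : null,
    end: second ? parseDmyDate(second) : null,
  };
}

// Assumption: the availability column contains "Да" when the product
// is in stock; any other value is treated as unavailable.
function parseAvailability(raw: string): boolean {
  return raw.trim().toLowerCase() === "да";
}
```

If the source's timestamps need to reflect local time rather than the calendar date alone, a time zone library would be needed to apply the correct offset before converting to UTC.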
API Endpoints
The backend will provide the following key API endpoints:
- Products
  - GET /products - List all products with pagination
  - GET /products/:id - Get product details
  - GET /products/search - Search products by name/category
- Prices
  - GET /prices/product/:id - Get all prices for a product
  - GET /prices/compare/:ids - Compare prices for multiple products
  - GET /prices/history/:id - Get price history for a product
- Sources
  - GET /sources - List all data sources
  - GET /sources/:id/products - Get products from a specific source
- User Features
  - POST /watchlist - Add product to watchlist
  - GET /watchlist - Get user's watchlist
  - POST /notifications - Configure price drop notifications
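As an illustration of what the compare endpoint might do internally, the sketch below parses the :ids path parameter and picks the cheapest price row per product. It assumes the IDs arrive as a comma-separated list (for example /prices/compare/1,2,3), which the document does not specify. PriceRow, parseIdList, and cheapestPerProduct are hypothetical names, and the rows would come from the Price repository in practice.

```typescript
interface PriceRow {
  productId: number;
  sourceId: number;
  price: number;
}

// Parses "1, 2,3" into [1, 2, 3], dropping invalid and duplicate entries.
function parseIdList(param: string): number[] {
  const ids = param
    .split(",")
    .map((part) => Number(part.trim()))
    .filter((id) => Number.isInteger(id) && id > 0);
  return Array.from(new Set(ids));
}

// For each requested product, keeps the row with the lowest price.
function cheapestPerProduct(
  rows: PriceRow[],
  productIds: number[],
): Map<number, PriceRow> {
  const wanted = new Set(productIds);
  const best = new Map<number, PriceRow>();
  for (const row of rows) {
    if (!wanted.has(row.productId)) continue;
    const current = best.get(row.productId);
    if (current === undefined || row.price < current.price) {
      best.set(row.productId, row);
    }
  }
  return best;
}
```

Products with no price rows are simply absent from the result, so the controller can decide whether to report them as missing.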