e7f2b240
tangwang
first commit
# OmniShopAgent
An autonomous multi-modal fashion shopping agent powered by **LangGraph** and the **ReAct pattern**.
## Demo
**[demo.pdf](./demo.pdf)**
## Overview
OmniShopAgent autonomously decides which tools to call, maintains conversation state, and determines when to respond. Built with **LangGraph**, it uses agentic patterns for intelligent product discovery.
**Key Features:**
- Autonomous tool selection and execution
- Multi-modal search (text + image)
- Conversational context awareness
- Real-time visual analysis
## Tech Stack
| Component | Technology |
|-----------|-----------|
| **Agent Framework** | LangGraph |
| **LLM** | Any LLM supported by LangChain |
| **Text Embedding** | text-embedding-3-small |
| **Image Embedding** | CLIP ViT-B/32 |
| **Vector Database** | Milvus |
| **Frontend** | Streamlit |
| **Dataset** | Kaggle Fashion Products |
## Architecture
**Agent Flow:**
```mermaid
graph LR
START --> Agent
Agent -->|Has tool_calls| Tools
Agent -->|No tool_calls| END
Tools --> Agent
subgraph "Agent Node"
A[Receive Messages] --> B[LLM Reasoning]
B --> C{Need Tools?}
C -->|Yes| D[Generate tool_calls]
C -->|No| E[Generate Response]
end
subgraph "Tool Node"
F[Execute Tools] --> G[Return ToolMessage]
end
```
**Available Tools:**
- `search_products(query)` - Text-based semantic search
- `search_by_image(image_path)` - Visual similarity search
- `analyze_image_style(image_path)` - VLM style analysis
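
The agent/tool loop in the diagram can be sketched in plain Python. This is a minimal illustration, not the project's LangGraph code: `Message`, `fake_llm`, `run_agent`, and the echo-style tool bodies are hypothetical stand-ins, and the "LLM" is a scripted function that requests one search and then answers.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Message:
    role: str                                   # "user", "assistant", or "tool"
    content: str = ""
    tool_calls: list[dict] = field(default_factory=list)


# Hypothetical stand-ins for the real tools; they only echo their input.
def search_products(query: str) -> str:
    return f"products matching '{query}'"


def search_by_image(image_path: str) -> str:
    return f"products visually similar to {image_path}"


TOOLS: dict[str, Callable[[str], str]] = {
    "search_products": search_products,
    "search_by_image": search_by_image,
}


def fake_llm(messages: list[Message]) -> Message:
    """Scripted stand-in for the LLM: request one search, then answer."""
    last = messages[-1]
    if last.role == "user":
        call = {"name": "search_products", "arg": last.content}
        return Message("assistant", tool_calls=[call])
    return Message("assistant", content=f"Here is what I found: {last.content}")


def run_agent(user_input: str, llm=fake_llm, max_steps: int = 5) -> list[Message]:
    """Alternate between the agent node and the tool node until no tools are requested."""
    messages = [Message("user", user_input)]
    for _ in range(max_steps):
        reply = llm(messages)                   # Agent node: LLM reasoning
        messages.append(reply)
        if not reply.tool_calls:                # No tool_calls: go to END
            break
        for call in reply.tool_calls:           # Tool node: execute each call
            result = TOOLS[call["name"]](call["arg"])
            messages.append(Message("tool", result))
    return messages
```

Calling `run_agent("winter coats for women")` yields a user message, an assistant message carrying a tool call, a tool result, and a final assistant response. The `max_steps` cap is an added safeguard in this sketch so a misbehaving model cannot loop forever.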
## Examples
**Text Search:**
```
User: "winter coats for women"
Agent: search_products("winter coats women") → Returns 5 products
```
**Image Upload:**
```
User: [uploads sneaker photo] "find similar"
Agent: search_by_image(path) → Returns visually similar shoes
```
**Style Analysis + Search:**
```
User: [uploads vintage jacket] "what style is this? find matching pants"
Agent: analyze_image_style(path) → "Vintage denim bomber..."
       search_products("vintage pants casual") → Returns matching items
```
**Multi-turn Context:**
```
Turn 1: "show me red dresses"
Agent: search_products("red dresses") → Results
Turn 2: "make them formal"
Agent: [remembers context] → search_products("red formal dresses") → Results
```
**Complex Reasoning:**
```
User: [uploads office outfit] "I like the shirt but need something more casual"
Agent: analyze_image_style(path) → Extracts shirt details
       search_products("casual shirt [color] [style]") → Returns casual alternatives
```
## Installation
**Prerequisites:**
- Python 3.12+ (LangChain 1.x requires Python 3.10+)
- OpenAI API Key
- Docker & Docker Compose
### 1. Setup Environment
```bash
# Clone and install dependencies
git clone <repository-url>
cd OmniShopAgent
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
# Configure environment variables
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
```
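
A missing API key tends to surface only at the first LLM call, so it can help to fail early. The helper below is a hypothetical sketch (`require_env` is not part of the project) and assumes the values from `.env` have already been loaded into the process environment.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of `name`, or raise if it is missing or blank."""
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env before starting the app")
    return value
```

For example, `require_env("OPENAI_API_KEY")` at startup would raise a clear error instead of letting the app start without credentials.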
### 2. Download Dataset
Download the [Fashion Product Images Dataset](https://www.kaggle.com/datasets/paramaggarwal/fashion-product-images-dataset) from Kaggle and extract it to `./data/`:
```bash
python scripts/download_dataset.py
```
Expected structure:
```
data/
├── images/       # ~44k product images
├── styles.csv    # Product metadata
└── images.csv    # Image filenames
```
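
Before indexing, it may be worth confirming that the extracted files match the layout above. The following is a small sketch; the function name `missing_dataset_entries` is an illustrative assumption, not a script shipped with the repository.

```python
from pathlib import Path

# Entries the README expects directly under the data directory.
EXPECTED_ENTRIES = ("images", "styles.csv", "images.csv")


def missing_dataset_entries(data_dir: str = "data") -> list[str]:
    """Return the expected entries that do not exist under `data_dir`."""
    root = Path(data_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_dataset_entries()
    if missing:
        print("Missing from ./data/:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```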
### 3. Start Services
```bash
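# Start the Docker services in the background so the shell stays free
docker-compose up -d
# Start the CLIP embedding server
python -m clip_server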
```
### 4. Index Data
```bash
python scripts/index_data.py
```
This generates and stores text/image embeddings for all 44k products in Milvus.
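
Conceptually, indexing stores one embedding per product, and search ranks the stored vectors by similarity to a query embedding. The toy index below illustrates that idea in pure Python; it is a stand-in for Milvus, not its API, and `InMemoryIndex` along with the two-dimensional vectors in the usage note are made up for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Brute-force vector index: stores embeddings and scans all of them on search."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, product_id: str, embedding: list[float]) -> None:
        self._items.append((product_id, embedding))

    def search(self, query: list[float], top_k: int = 5) -> list[tuple[str, float]]:
        scored = [(pid, cosine_similarity(query, emb)) for pid, emb in self._items]
        scored.sort(key=lambda pair: pair[1], reverse=True)   # most similar first
        return scored[:top_k]
```

With products `a = [1, 0]`, `b = [0, 1]`, and `c = [1, 1]` added, `search([1, 0], top_k=2)` ranks `a` first and `c` second. A real vector database replaces the linear scan with an approximate nearest-neighbor index, which is what makes search over tens of thousands of products fast.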
### 5. Launch Application
```bash
# Use the startup script (recommended)
./scripts/start.sh
# Or run directly
streamlit run app.py
```
The app opens at `http://localhost:8501`.
### CentOS 8 Deployment
See [docs/DEPLOY_CENTOS8.md](docs/DEPLOY_CENTOS8.md) for details.