An AI System for Transforming Scattered Web Data into an Interactive, Queryable Knowledge Graph
Keywords:
Knowledge Graph, Semantic Search, Retrieval-Augmented Generation, Vector Embeddings, Large Language Model, FastAPI, pgvector, Personal Knowledge Management, SentenceTransformer, Cytoscape
Abstract
The exponential growth of online information creates significant challenges for individuals organizing and retrieving knowledge scattered across web sources. This paper presents MindCanvas, an AI-driven system that transforms unstructured browsing data into a coherent, interactive knowledge graph. It supports semantic search and natural-language querying by combining large language models (LLMs), dense vector embeddings, and retrieval-augmented generation (RAG). The system captures web pages through a Chrome extension, extracts topics using GPT-4.1-nano, generates 384-dimensional embeddings with all-MiniLM-L6-v2, stores them in PostgreSQL with pgvector, and renders an interactive graph with Cytoscape.js. Evaluations on a 1 GB dataset (2,500 pages) yield a topic-extraction F1 of 0.81, an MRR@5 of 0.74, and a graph modularity of 0.52. User studies (n=15) report an 87% recall improvement, 93% connection-discovery success, and a 73% reduction in redundant searching.
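As an illustration of the retrieval step and of the MRR@5 metric reported above, the following is a minimal sketch in plain Python. It is not the MindCanvas implementation: the function names (`cosine_similarity`, `top_k`, `mrr_at_k`), the page identifiers, and the 3-dimensional toy vectors are all hypothetical stand-ins for the 384-dimensional embeddings, and the in-memory ranking only approximates what a pgvector cosine-distance query would return.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, page_vecs, k=5):
    """Return the ids of the k pages most similar to the query.

    page_vecs is a hypothetical dict mapping page id -> embedding.
    """
    ranked = sorted(
        page_vecs,
        key=lambda pid: cosine_similarity(query_vec, page_vecs[pid]),
        reverse=True,
    )
    return ranked[:k]


def mrr_at_k(ranked_lists, relevant_ids, k=5):
    """Mean reciprocal rank, counting only hits within the top k.

    ranked_lists[i] is the ranked result list for query i and
    relevant_ids[i] is the single page assumed relevant for it.
    A query with no hit in the top k contributes 0.
    """
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_ids):
        for rank, pid in enumerate(ranked[:k], start=1):
            if pid == relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


# Toy data: three pages with made-up 3-dimensional embeddings.
pages = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.9, 0.1, 0.0]}
results = top_k([1.0, 0.0, 0.0], pages, k=2)
score = mrr_at_k([results, ["b", "a"]], ["c", "z"], k=5)
```

In this toy run the relevant page for the first query sits at rank 2 and the second query has no relevant page in its results, so the reciprocal ranks are 1/2 and 0.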
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.










