IPE-24 Classroom: Building a Zero-Cost Operating System for a University Batch
Executive Summary
IPE-24 Classroom started as a class portal, but that description is too small for what the system actually became. The real problem was not "students need a website." The real problem was that one batch of students was coordinating academic life across too many inconsistent channels: verbal updates, WhatsApp threads, Google Drive folders, routine changes, exam notices, Discord messages, and Class Representative memory.
The batch needed three things:
- One trusted source of truth for class state.
- Fast publishing across multiple communication channels.
- Strong enough guardrails that the wrong person, wrong request, or wrong automation could not corrupt the batch's academic workflow.
What follows is the real case study: the problem, the system I designed, every major feature, the technical hassle behind each layer, the security posture, and the recurring architectural patterns I learned to recognize.
The Problem I Was Actually Solving
At surface level, the product looks familiar: announcements, routine, resources, exams, chatbot, admin dashboard.
At system level, the product solves a much nastier problem: academic coordination in an environment where information changes frequently, trust is role-sensitive, and the communication surface is fragmented.
The class had several recurring pain points:
- Announcements were getting buried in chat streams.
- Routine changes were temporal, not permanent, so a simple timetable table was never enough.
- Files existed in Google Drive, but discoverability was weak and folder hygiene degraded over time.
- Exam information was time-sensitive and needed structured metadata, not just message text.
- The Class Representative needed to publish quickly from mobile, not from a laptop admin panel every time.
- Students kept asking the same contextual questions: what classes are today, what changed this week, where is the file, what exam is next.
The product therefore had to behave less like a content site and more like a lightweight academic operations platform.
That shift changed every design decision.
The First Important Realization: This Was Not a Website
The first serious architecture lesson was that the website could not be the system. It could only be one projection of the system.
The actual system had four planes:
| Plane | Responsibility |
|---|---|
| Authoritative state | PostgreSQL via Prisma for announcements, exams, routines, files, users, audit, knowledge, notifications |
| Operator interface | Next.js admin panel plus Telegram command center for the CR |
| Distribution | Website, push notifications, Discord, WhatsApp |
| Derived acceleration | Redis cache, SSE invalidation, service worker cache, IndexedDB, AI context assembly |
Once I started thinking in those planes, the codebase got more coherent. Features stopped being isolated pages and became state transitions with downstream fan-out.
That mental model is the backbone of the whole product.
Architecture in Its Final Shape
The system lives in a monorepo, but operationally it is split between Vercel and a self-hosted VPS:
- apps/web: Next.js 14 app serving the student UI, admin UI, and versioned API routes.
- apps/bot: WhatsApp delivery service using Baileys.
- services/telegram-bot: the CR's command center, including classification and approval flow.
- services/discord-bot: outbound Discord announcement delivery.
- services/discord-listener: inbound Discord ingestion for approved messages and knowledge capture.
- services/transcriber: Python FastAPI service for faster-whisper transcription and text embeddings.
- PostgreSQL 16 + pgvector: durable state plus semantic search substrate.
- Redis: rate limiting, cache, pub/sub, and event fan-out.
The most important boundary is that the web app is the authority, while the bots are controlled writers through internal APIs. That means the bots do not mutate the database directly. They authenticate with an internal secret and write through narrowly scoped routes like:
- /api/v1/internal/announcements
- /api/v1/internal/exams
- /api/v1/internal/routine/overrides
- /api/v1/internal/files
This was a deliberate decision. It kept business rules centralized, cache invalidation consistent, and auditing possible.
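A minimal sketch of that boundary check, assuming a shared-secret header named `x-internal-secret` and a helper called `isTrustedInternalCall`. Both names are hypothetical; the actual header and guard are not shown in this write-up. The comparison is written by hand so the example stays dependency-free.

```typescript
// Hypothetical header name; the real one is not documented here.
const INTERNAL_SECRET_HEADER = "x-internal-secret";

// Compare two strings without returning early on the first mismatch,
// so response timing reveals less about the expected secret.
function constantTimeEqual(a: string, b: string): boolean {
  const length = Math.max(a.length, b.length);
  let diff = a.length ^ b.length;
  for (let i = 0; i < length; i++) {
    // charCodeAt returns NaN past the end of a string; `| 0` turns that into 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0);
  }
  return diff === 0;
}

// True only when a secret is configured and the caller sent the same value.
function isTrustedInternalCall(
  headers: Record<string, string | undefined>,
  expectedSecret: string | undefined,
): boolean {
  if (!expectedSecret) return false; // fail closed when the secret is not configured
  const provided = headers[INTERNAL_SECRET_HEADER];
  if (!provided) return false;
  return constantTimeEqual(provided, expectedSecret);
}
```

The point of the sketch is the fail-closed default: a missing secret on either side rejects the request instead of silently allowing it.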
The Story of the Features
1. Announcements: The Simplest Feature That Was Not Actually Simple
Announcements look trivial until you ask what "published" means.
In this system, an announcement is not just text in a table. It carries:
- semantic type (general, exam, routine_update, urgent, event, course_update)
- author identity
- publication state
- downstream delivery state for WhatsApp and Discord
- optional course scoping through a join table
The deeper problem was delivery integrity. A post could originate from the admin panel or from Telegram. Either way, once persisted, it had to trigger cache invalidation, student visibility, and optionally cross-channel distribution without duplicating logic.
This is where a recurring pattern emerged:
ingest -> validate -> persist -> audit -> invalidate -> publish
I started seeing that same pipeline show up in almost every feature.
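The pipeline can be sketched with in-memory stand-ins for the database, audit log, cache, and delivery channels. Everything here (`publishAnnouncement`, the cache key `announcements:feed`, the outbox format) is a hypothetical illustration of the ordering, not the production code.

```typescript
type Announcement = { id: number; title: string; body: string };
type Draft = { title?: unknown; body?: unknown };

// In-memory stand-ins for the database table, audit log, cache, and channels.
const table: Announcement[] = [];
const auditLog: string[] = [];
const cache = new Map<string, Announcement[]>();
const outbox: string[] = [];

// validate: reject anything that is not a well-formed draft.
function validate(draft: Draft): { title: string; body: string } {
  if (typeof draft.title !== "string" || draft.title.trim() === "") {
    throw new Error("title is required");
  }
  if (typeof draft.body !== "string") throw new Error("body must be text");
  return { title: draft.title.trim(), body: draft.body };
}

function publishAnnouncement(draft: Draft, actor: string): Announcement {
  const clean = validate(draft); // validate
  const row: Announcement = { id: table.length + 1, ...clean };
  table.push(row); // persist
  auditLog.push(`${actor} created announcement ${row.id}`); // audit
  cache.delete("announcements:feed"); // invalidate
  outbox.push(`discord:${row.id}`, `whatsapp:${row.id}`); // publish
  return row;
}
```

Because validation throws before anything is written, a rejected draft leaves no row, no audit entry, and no outbound message.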
2. Routine: Modeling Time Correctly Was Harder Than Rendering It
The routine system taught me one of the biggest system design lessons in this project: temporal exceptions deserve first-class modeling.
If I had modeled the routine as one mutable timetable, I would have created a maintenance nightmare. Real class schedules are a mix of:
- stable baseline slots
- A/B week parity
- target-group-specific classes
- lab vs theory distinctions
- one-off overrides for cancellations, makeup classes, room shifts, or teacher changes
So the system split routine data into:
- BaseRoutine
- RoutineOverride
- RoutineWeek
That split matters.
BaseRoutine stores the canonical recurring schedule. RoutineOverride captures time-bounded deviations. RoutineWeek tracks calendar week semantics, including week type and skipped weeks. This kept the model resilient against the most common academic scheduling problem: exceptions outgrowing the original schedule.
The student UI only shows the result. The real engineering work is in preserving the distinction between stable truth and temporary mutation.
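To illustrate how the three models could combine into the view a student sees, here is a minimal sketch. The field names (`weekday`, `week`, `baseSlotId`, `kind`) and the rule that a skipped week yields no classes are assumptions for illustration; the real schema is not reproduced in this write-up.

```typescript
type WeekType = "A" | "B";

interface BaseSlot { id: number; weekday: number; week: WeekType | "both"; course: string; room: string; start: string }
interface Override {
  date: string; // ISO date the exception applies to
  kind: "cancel" | "room_change" | "makeup";
  baseSlotId?: number; // set for cancel and room_change
  course?: string; room?: string; start?: string; // set for makeup
}
interface Week { type: WeekType; skipped: boolean }
interface ResolvedClass { course: string; room: string; start: string; note?: string }

// Layer one day's overrides on top of the baseline without mutating either.
function resolveDay(
  date: string, weekday: number, week: Week,
  base: BaseSlot[], overrides: Override[],
): ResolvedClass[] {
  if (week.skipped) return []; // assumption: a skipped week has no classes
  const todays = overrides.filter((o) => o.date === date);
  const result: ResolvedClass[] = [];
  for (const slot of base) {
    if (slot.weekday !== weekday) continue;
    if (slot.week !== "both" && slot.week !== week.type) continue;
    const o = todays.find((x) => x.baseSlotId === slot.id);
    if (o && o.kind === "cancel") continue;
    if (o && o.kind === "room_change" && o.room) {
      result.push({ course: slot.course, room: o.room, start: slot.start, note: "room changed" });
      continue;
    }
    result.push({ course: slot.course, room: slot.room, start: slot.start });
  }
  for (const o of todays) {
    if (o.kind === "makeup" && o.course && o.room && o.start) {
      result.push({ course: o.course, room: o.room, start: o.start, note: "makeup class" });
    }
  }
  return result.sort((a, b) => a.start.localeCompare(b.start));
}
```

The baseline array is never edited; removing an override restores the original schedule for that date automatically.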
3. Exams and Assignments: Deadlines Need Structure, Not Messages
Exam tracking became its own bounded context because informal notices were not enough. Students needed:
- countdown visibility
- course linkage
- room metadata
- syllabus/instructions
- assignment-vs-exam distinction
- submission state per student
That is why Exam and AssignmentSubmission exist separately instead of treating everything as announcement text. Once data became structured, the system could support timelines, filters, upcoming windows, and student-specific submission states.
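As a rough sketch of what structure buys, the function below derives an upcoming window with a countdown and a per-student submission state. The field names (`kind`, `dueAt`, `examId`, `studentId`) and the helper `upcomingFor` are hypothetical, chosen only to illustrate the idea.

```typescript
interface Exam { id: number; course: string; kind: "exam" | "assignment"; dueAt: number } // dueAt in epoch ms
interface AssignmentSubmission { examId: number; studentId: string }
interface UpcomingItem {
  id: number; course: string; kind: "exam" | "assignment";
  daysLeft: number;
  submitted: boolean | null; // null for exams, which have no submission
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Items due within the next `windowDays`, soonest first, with this student's submission state.
function upcomingFor(
  studentId: string, exams: Exam[], submissions: AssignmentSubmission[],
  now: number, windowDays: number,
): UpcomingItem[] {
  const horizon = now + windowDays * DAY_MS;
  return exams
    .filter((e) => e.dueAt >= now && e.dueAt <= horizon)
    .sort((a, b) => a.dueAt - b.dueAt)
    .map((e) => ({
      id: e.id,
      course: e.course,
      kind: e.kind,
      daysLeft: Math.ceil((e.dueAt - now) / DAY_MS),
      submitted: e.kind === "assignment"
        ? submissions.some((s) => s.examId === e.id && s.studentId === studentId)
        : null,
    }));
}
```

None of this is possible when a deadline exists only as a sentence inside an announcement.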
This is one of those places where "normalizing the domain" mattered more than adding UI polish.
4. Resource Library and Google Drive: Metadata Was the Real Product
The files feature looks like upload/download, but the real product is metadata discipline on top of external storage.
The source of bytes is Google Drive. The source of truth for discoverability is the database.
The file layer tracks:
- driveId
- driveUrl
- optional downloadUrl
- MIME type and size
- course association
- uploader
- drive connection lineage
- folder and subfolder visibility rules
The actual hard problems were:
- dealing with multiple auth sources for Drive access
- supporting connected drives and shared-drive folders
- keeping subfolders soft-hidden instead of hard-deleted
- making uploads resumable
- avoiding local buffering when streaming to Drive
- preserving referential cleanup semantics when files are removed
This feature changed my understanding of integrations. The API client is easy. The lifecycle rules are the real engineering.
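One of those lifecycle rules, soft-hidden subfolders, can be sketched as a visibility filter over database metadata. The shapes below (`Folder.hidden`, `parentId`) and the rule that hiding a folder also hides its descendants are assumptions for illustration; only `driveId` comes from the list above.

```typescript
interface Folder { id: string; parentId: string | null; hidden: boolean }
interface FileRecord { driveId: string; name: string; folderId: string }

// A folder is visible only if it and every ancestor exist and are not hidden.
function isFolderVisible(folderId: string, folders: Map<string, Folder>): boolean {
  const seen = new Set<string>();
  let current: string | null = folderId;
  while (current !== null) {
    if (seen.has(current)) return false; // cycle in metadata: fail closed
    seen.add(current);
    const folder = folders.get(current);
    if (!folder || folder.hidden) return false;
    current = folder.parentId;
  }
  return true;
}

// Hidden files stay in the database and in Drive; they are only filtered from listings.
function visibleFiles(files: FileRecord[], folders: Map<string, Folder>): FileRecord[] {
  return files.filter((f) => isFolderVisible(f.folderId, folders));
}
```

Un-hiding a folder is then a single flag change, with no re-indexing and no lost records.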
5. Admin Operations: CRUD Was the Least Interesting Part
The admin surface includes:
- announcements
- exams
- routine and overrides
- files
- shared drives
- courses
- users and roles
- audit log
- knowledge base
- settings
- Telegram bot config
The interesting part is not that these pages exist. It is that they all sit on top of the same permission spine and side-effect model.
Every admin mutation has to answer four questions:
- Is the actor allowed to do this?
- What exact fields are mutable?
- What dependent caches or client views become stale?
- What audit artifact must survive even if the UI changes later?
That is why centralized guards and audit logging mattered more than page generation speed.
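The four questions can be captured as data rather than scattered through handlers. The sketch below is hypothetical (`MutationSpec`, `planMutation`, and the cache key names are invented for illustration), and it assumes the role order student < admin < super_admin.

```typescript
type Role = "student" | "admin" | "super_admin";
const ROLE_RANK: Record<Role, number> = { student: 0, admin: 1, super_admin: 2 };

interface MutationSpec {
  minRole: Role; // 1. who is allowed to do this
  mutableFields: string[]; // 2. which fields may change
  staleKeys: string[]; // 3. which caches or views become stale
  auditAction: string; // 4. what gets recorded
}

interface MutationPlan { changes: Record<string, unknown>; invalidate: string[]; audit: string }

// Turn an untrusted request body into an explicit plan, or throw if the actor lacks the role.
function planMutation(
  spec: MutationSpec,
  actor: { id: string; role: Role },
  body: Record<string, unknown>,
): MutationPlan {
  if (ROLE_RANK[actor.role] < ROLE_RANK[spec.minRole]) throw new Error("forbidden");
  const changes: Record<string, unknown> = {};
  for (const field of spec.mutableFields) {
    // Copy only whitelisted fields; anything else in the body is ignored.
    if (Object.prototype.hasOwnProperty.call(body, field)) changes[field] = body[field];
  }
  return { changes, invalidate: spec.staleKeys, audit: `${spec.auditAction} by ${actor.id}` };
}
```

With a spec per mutation, a new admin page cannot skip a question by accident, because a missing answer is a missing field in the spec.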
6. Telegram Command Center: The Highest-Leverage Feature in the Whole Product
If I had to name the feature with the highest operational leverage, it would be the Telegram bot.
This is where the product stopped being a portal and became an operating system for class communication.
The CR can send text or voice. The bot transcribes if needed, classifies the content, sends back a preview, waits for explicit approval, and only then writes through internal APIs and triggers downstream fan-out.
That approval loop is not cosmetic. It is a trust boundary.
The flow is intentionally human-in-the-loop:
Telegram input -> optional whisper transcription -> AI classification/formatting -> preview -> human confirm -> internal API write -> invalidate cache -> distribute
This pattern reduced the blast radius of misclassification and made automation usable in a real academic workflow. Full automation would have been faster, but materially less trustworthy.
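The approval loop can be modeled as a small state machine in which the write callback is reachable only from the confirm step. This is a sketch under assumptions: `ApprovalQueue`, its method names, and the draft states are hypothetical, and the `write` callback stands in for the internal API call.

```typescript
type DraftState = "awaiting_confirmation" | "published" | "discarded";
interface PendingDraft { id: number; kind: string; text: string; state: DraftState }

class ApprovalQueue {
  private drafts = new Map<number, PendingDraft>();
  private nextId = 1;
  private write: (draft: PendingDraft) => void;

  constructor(write: (draft: PendingDraft) => void) {
    this.write = write; // stand-in for the internal API write
  }

  // Store the classified draft and return it so the bot can show a preview.
  preview(kind: string, text: string): PendingDraft {
    const draft: PendingDraft = { id: this.nextId++, kind, text, state: "awaiting_confirmation" };
    this.drafts.set(draft.id, draft);
    return draft;
  }

  // The only path that reaches `write`; a second confirm is rejected.
  confirm(id: number): PendingDraft {
    const draft = this.pending(id);
    this.write(draft);
    draft.state = "published";
    return draft;
  }

  discard(id: number): PendingDraft {
    const draft = this.pending(id);
    draft.state = "discarded";
    return draft;
  }

  private pending(id: number): PendingDraft {
    const draft = this.drafts.get(id);
    if (!draft) throw new Error(`unknown draft ${id}`);
    if (draft.state !== "awaiting_confirmation") throw new Error(`draft ${id} is already ${draft.state}`);
    return draft;
  }
}
```

A misclassified message costs one discarded preview instead of a wrong announcement delivered to every channel.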
7. Discord, WhatsApp, and Inbound Automation
Outbound Discord and WhatsApp matter because the portal cannot assume students will check the web app first. The system therefore pushes updates to where attention already exists.
But there is also an inbound side: the Discord listener can observe configured channels and route approved content back into the web platform as structured knowledge or announcements.
That gave the system an interesting duality:
- website as source of truth
- chat platforms as both distribution channels and context collection surfaces
This is one of the first places where I began to recognize bidirectional integration as a category with very different failure modes from simple outbound automation.
8. Virtual CR: The Most Ambitious Feature, and the One That Taught Me the Most Humility
The AI Virtual CR was supposed to be a clean RAG feature: ingest documents, chunk them, embed them, retrieve relevant context, answer with Gemini, done.
Reality was messier.
The codebase clearly shows both the intended vector-search architecture and the operational compromises:
- KnowledgeDocument and KnowledgeChunk models
- chunking with overlap
- pgvector similarity search
- embedding generation
- chunk caps to protect free-tier storage
- prompt-building with source grounding
- prompt-injection heuristics
- live context assembly from announcements, routine, exams, files, and course catalog
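Two of those pieces, chunking with overlap and the chunk cap, fit in one short function. This is a character-based sketch with hypothetical parameter names; the actual chunk size, overlap, cap, and whether splitting is by characters or tokens are not stated here.

```typescript
// Split text into windows of `size` characters where each window repeats the
// last `overlap` characters of the previous one, stopping at `maxChunks`.
function chunkWithOverlap(text: string, size: number, overlap: number, maxChunks: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  const step = size - overlap; // how far the window advances each time
  for (let start = 0; start < text.length && chunks.length < maxChunks; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // the last window already reached the end
  }
  return chunks;
}
```

For example, `chunkWithOverlap("abcdefghij", 4, 2, 10)` yields `["abcd", "cdef", "efgh", "ghij"]`, and lowering the cap to 2 keeps only the first two chunks.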
The deep lesson here was that retrieval quality is only one part of the problem. Production viability is shaped by:
- Vercel duration ceilings
- Gemini quota behavior
- embedding latency
- cold starts
- context-window budgeting
- prompt abuse
- user expectation mismatch
That is why the system evolved toward a hybrid model. Instead of relying only on vector recall, the chatbot can assemble live database context for high-frequency questions such as:
- what classes are today
- what exams are coming
- what changed in the routine
- where is a file
- what recent announcements exist
This was an architectural concession to reality, and a good one. A pure semantic retrieval pipeline is elegant on paper. A bounded live-context assistant is often more reliable under free-tier constraints.
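A minimal sketch of the routing decision, assuming simple keyword rules: questions that match a live source get database context, and everything else falls back to vector search. The rule list and the name `pickContextSources` are hypothetical; the production routing is not shown in this write-up.

```typescript
type ContextSource = "routine" | "exams" | "files" | "announcements" | "vector_search";

// Hypothetical keyword rules for high-frequency questions.
const LIVE_RULES: Array<{ source: ContextSource; pattern: RegExp }> = [
  { source: "routine", pattern: /\b(class|classes|routine|schedule)\b/i },
  { source: "exams", pattern: /\b(exam|exams|quiz|quizzes|deadline|assignment)\b/i },
  { source: "files", pattern: /\b(file|files|slide|slides|pdf|notes)\b/i },
  { source: "announcements", pattern: /\b(announcement|announcements|notice|notices)\b/i },
];

// Prefer cheap live context; use semantic retrieval only when nothing matches.
function pickContextSources(question: string): ContextSource[] {
  const live = LIVE_RULES.filter((rule) => rule.pattern.test(question)).map((rule) => rule.source);
  return live.length > 0 ? live : ["vector_search"];
}
```

The design choice is that the expensive path (embedding plus similarity search) runs only for questions the live database cannot answer directly.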
9. Student Experience Features
The student-facing surface is broader than the obvious core pages. It includes:
- dashboard
- announcements
- routine
- exams
- resources
- chat
- polls
- study groups
- notifications
- search
- profile
- settings
Some of these are fully central to the product. Some are secondary but important because they reduce the need to leave the platform. Some are partially built and need more proof in live usage.
Polls and study groups are good examples. Technically they exist and are integrated into the role model and API structure. Product-wise, they are more experimental. I would rather say that honestly than oversell them in a portfolio.
10. The Small Features That Make the System Feel Complete
Some of the most important features are not architecturally glamorous, but they reduce friction enough that the system starts feeling cohesive:
- the dashboard compresses announcements, routine, and upcoming exams into one operational snapshot
- push notifications make the portal behave like an active channel instead of a passive destination
- notifications history gives students a second chance after missing a real-time update
- search reduces dependency on remembering where a file or post originally appeared
- profile and settings keep the identity layer useful instead of purely decorative
These are not "extra pages." They are the glue that reduces context switching.
11. Peripheral and Transitional Features
A mature case study should also acknowledge the edges:
- search exists as a surface, but needs clearer validation as a daily-use feature
- notifications complement push but may overlap in value
- 2FA and password flows exist, but the primary login path is still Google OAuth
- a Discord admin page exists without equivalent backend depth
- Google Sheets and n8n appear as evolutionary artifacts from earlier architecture phases
- feature reschedule exists as a specialized adjunct to the routine domain
These are not failures. They are evidence of a live system that evolved through real constraints rather than clean-room planning.
The Technical Hassles That Actually Shaped the Product
Free Tier Economics Affected Core Architecture
This product was designed under an aggressive cost constraint: recurring cost had to stay effectively zero.
That single constraint influenced:
- choice of Gemini free-tier usage patterns
- local transcription and embedding
- storage caps for knowledge chunks
- resumable uploads instead of heavyweight file proxying
- serverless-compatible cache choices
- selective reliance on VPS-hosted services
This is one of the strongest examples in the project of product constraints becoming architectural constraints.
Cache Invalidation Was a First-Class Problem
The app is not fast because Next.js is fast. It is fast because the data plane is layered:
- client SWR cache
- persistent browser cache
- service worker
- server ETags
- SSE invalidation
- Redis server cache
That is a serious system, not a frontend trick.
The important insight is that invalidation was not generic. It had to be domain-aware:
- announcements invalidate announcement feeds
- exams invalidate exam windows and dashboard projections
- routine overrides invalidate temporal views, not just raw routine records
- file changes can invalidate global resources or course-scoped subsets
- chat history invalidation is user-scoped
If I had treated caching as a library concern instead of a domain concern, the app would have become both stale and unpredictable.
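Domain-aware invalidation can be sketched as a function from a mutation event to the keys it makes stale. The event shapes and key names below are hypothetical; they only mirror the rules in the list above.

```typescript
type MutationEvent =
  | { kind: "announcement" }
  | { kind: "exam" }
  | { kind: "routine_override"; date: string }
  | { kind: "file"; courseId?: string }
  | { kind: "chat"; userId: string };

// Map each kind of mutation to the cache keys it makes stale.
function staleKeys(event: MutationEvent): string[] {
  switch (event.kind) {
    case "announcement":
      return ["announcements:feed"];
    case "exam":
      return ["exams:upcoming", "dashboard"];
    case "routine_override":
      // The temporal view for that date goes stale, not only the override list.
      return ["routine:overrides", `routine:day:${event.date}`];
    case "file":
      return event.courseId ? [`files:course:${event.courseId}`] : ["files:global"];
    case "chat":
      return [`chat:history:${event.userId}`];
  }
}
```

Because the union is closed, adding a new mutation kind without deciding what it invalidates becomes a compile-time error under strict settings rather than a stale page in production.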
Security Work Was Not an Afterthought
Security was not a later polish pass. It was embedded into the design because the system mixes personal data, privileged actors, internal service routes, and AI-driven behavior.
The meaningful controls include:
- strict @iut-dhaka.edu domain gating for primary auth
- multi-layer role enforcement in middleware, route guards, query scoping, and UI
- explicit field whitelisting to block mass assignment
- raw SQL discipline around pgvector queries
- XSS mitigation through sanitization and content policy discipline
- origin checks and CSRF protection on state-changing routes
- prompt-injection detection for chatbot input
- Redis-backed rate limiting
- internal-route secret headers for bot-to-web writes
- audit logs for every meaningful admin action
The more senior lesson was that security in a system like this is mostly about trust-boundary clarity. Once the boundaries are fuzzy, every feature starts leaking risk into the next one.
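As one concrete control from the list, here is a fixed-window rate limiter with an in-memory map standing in for Redis. The class name, the window algorithm, and the limits are assumptions; the write-up does not say which algorithm the production limiter uses. With Redis, the same idea is commonly built from `INCR` plus `EXPIRE` on a per-user key.

```typescript
class FixedWindowLimiter {
  private counters = new Map<string, { windowStart: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request identified by `key` is allowed at time `now` (ms).
  allow(key: string, now: number): boolean {
    const entry = this.counters.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First request, or the previous window has expired: start a new window.
      this.counters.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}
```

Passing `now` explicitly keeps the limiter deterministic, which makes it easy to test without waiting on a real clock.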
Production Safety Needed Runtime Guards, Not Good Intentions
One of the hardest lessons in any real system is that dangerous operations should not merely be discouraged; they should be technically blocked.
In this project, destructive database operations, especially anything capable of truncation or unscoped deletion, became a design-level concern. That is the right instinct. In a product with live student data, "be careful" is not a control. Environment guards are controls.
This also applies to test isolation. If a test can accidentally point at a production-like Supabase URL, the test harness is under-designed.
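A runtime guard of that kind might look like the sketch below. The function name, the environment variable names, and the allow-list of local hosts are assumptions for illustration; the essential property is that the guard fails closed when it cannot prove the target is a local test database.

```typescript
// Hypothetical allow-list: destructive helpers may only touch these hosts.
const SAFE_DB_HOSTS = ["localhost", "127.0.0.1"];

// Throws unless the environment is explicitly a test run against a local database.
function assertDestructiveAllowed(env: { NODE_ENV?: string; DATABASE_URL?: string }): void {
  if (env.NODE_ENV !== "test") {
    throw new Error("destructive operation blocked: NODE_ENV is not 'test'");
  }
  // Extract the host from a URL such as postgresql://user:pass@host:5432/db.
  const match = /^[a-z]+:\/\/(?:[^@\/]*@)?([^:\/?#]+)/i.exec(env.DATABASE_URL ?? "");
  if (!match) {
    throw new Error("destructive operation blocked: DATABASE_URL is missing or unparseable");
  }
  const host = match[1].toLowerCase();
  if (!SAFE_DB_HOSTS.includes(host)) {
    throw new Error(`destructive operation blocked: host '${host}' is not on the local allow-list`);
  }
}
```

Calling this at the top of any truncate or reset helper turns a misconfigured test run into an immediate, loud failure instead of data loss.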
Drive Integration Was Mostly About Failure Modes
Google Drive looked easy at first and then expanded into one of the more complex subsystems in the codebase.
The real work was in:
- auth fallback strategy
- refresh token lineage
- service-account fallback
- multi-drive indexing
- public link semantics
- resumable upload sessions
- stream-oriented transfer
- delete ordering between Drive and DB
- subfolder visibility state
The pattern I learned here is that third-party integration bugs rarely come from the happy path. They come from the states in between: expired tokens, partial uploads, stale folder topology, revoked access, duplicate indexing, and inconsistent deletion order.
Security Model as a System Design Story
The easiest way to understand the security model is to treat each class of actor as potentially honest-but-bounded or malicious-with-opportunity:
- outsiders should not authenticate
- students should not escalate privilege
- admins should not exceed their scope
- bots should not bypass business rules
- AI should not become an unrestricted answer engine
- background automation should not become a silent corruption vector
That threat framing led to a layered design:
Identity and Access
- Google OAuth for domain-bound identity
- optional password and TOTP path for privileged admin access
- role hierarchy: student, admin, super_admin
- fresh role checks at session usage time, not just first login
Data Integrity
- Prisma as the default query surface
- tagged raw SQL only where vector operations require it
- audit logs as durable mutation history
- explicit mutation fields instead of blind request-body spread
Service Trust
- internal bot routes require shared secret headers
- bots do not write directly to the database
- the web app remains the policy enforcement point
Content and AI Safety
- sanitization for rendered rich text
- prompt-injection pattern detection
- refusal posture for off-domain questions
- bounded context sources for AI answers
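Prompt-injection pattern detection can be as simple as a short list of regular expressions run before the model is called. The patterns below are illustrative examples, not the project's actual list, and a heuristic like this is only one layer: it reduces obvious abuse but cannot catch every rephrasing.

```typescript
// Hypothetical patterns for common instruction-override attempts.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore (all |any )?(previous|prior|above) (instructions|prompts)/i,
  /disregard (the )?(system|previous) (prompt|instructions)/i,
  /you are now\b/i,
  /reveal (your |the )?(system )?prompt/i,
];

// Returns true when the input matches any known override pattern.
function looksLikePromptInjection(input: string): boolean {
  // Collapse whitespace so line breaks and extra spaces do not hide a match.
  const normalized = input.replace(/\s+/g, " ").trim();
  return INJECTION_PATTERNS.some((pattern) => pattern.test(normalized));
}
```

A flagged message can then be refused outright or answered with a fixed response, without spending any model quota on it.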
From a system design perspective, the most important idea is not any single defense. It is the decision to avoid single-point trust.
Pattern Recognition: The Deeper Architecture Lessons
This project taught me to recognize repeating patterns much earlier.
Pattern 1: Most Features Were Just Specialized State Pipelines
Announcements, exams, routine overrides, files, and knowledge ingestion all eventually followed the same shape:
trusted input -> schema validation -> permission gate -> durable write -> audit -> cache invalidation -> downstream projection
Once I recognized that, implementation got simpler because I stopped treating every feature as unique.
Pattern 2: Temporal Exceptions Should Usually Be Separate Models
Routine changes were the clearest example, but the principle generalizes. Systems become fragile when transient exceptions are merged into baseline truth.
The fix is not smarter conditionals. The fix is better domain separation.
Pattern 3: AI Features Become Stable Only When the Non-AI Context Is Strong
The Virtual CR did not improve because the prompt got prettier. It improved when the surrounding system got better at:
- supplying current state
- limiting topic scope
- rejecting hostile input
- capping storage and latency
- choosing operationally cheap context paths
That is a strong pattern I now look for in every AI feature: model quality is downstream of systems quality.
Pattern 4: Caching Is Really a Consistency Strategy
Redis, SSE, ETags, service workers, and SWR are not performance decorations. Together they define how stale the user is allowed to be, for how long, and under what mutation events.
That framing made the caching layer far more coherent.
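The ETag layer shows the consistency framing in miniature: the client may keep its copy exactly as long as the server's representation has not changed. The sketch below assumes a simple FNV-1a hash to keep the example dependency-free; the helper names are hypothetical and the production hashing scheme is not described in this write-up.

```typescript
// 32-bit FNV-1a hash, used here only as a cheap non-cryptographic fingerprint.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

interface EtagResponse { status: 200 | 304; etag: string; body: string | null }

// Answer 304 with no body when the client's If-None-Match value matches the current payload.
function respondWithEtag(payload: unknown, ifNoneMatch: string | undefined): EtagResponse {
  const body = JSON.stringify(payload);
  const etag = `"${fnv1a(body)}"`;
  if (ifNoneMatch === etag) return { status: 304, etag, body: null };
  return { status: 200, etag, body };
}
```

A mutation changes the payload, which changes the tag, which forces the next conditional request to receive fresh data; nothing else needs to know the client had a cached copy.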
Pattern 5: Honest Documentation Is Part of System Design
The codebase contains fully active features, yellow-flag features, abandoned paths, and legacy artifacts. Calling those states accurately is part of engineering maturity. A system becomes easier to evolve when its ambiguity is documented rather than hidden.
What This Product Proved to Me
IPE-24 Classroom proved that I can design beyond interface-level development.
I did not just assemble pages. I had to reason about:
- trust boundaries
- temporal data modeling
- integration failure modes
- human-in-the-loop automation
- AI guardrails
- multi-channel delivery
- cache coherence
- role-based operational control
- free-tier systems economics
The most important outcome was not the number of features. It was that the system developed an internal logic. Each new capability either fit the architecture or exposed where the architecture was weak.
That is the shift from "self-taught person who can code" to "developer who can shape a system."
Final Reflection
If I had to summarize this case study in one sentence, it would be this:
I set out to build a class portal, but I ended up building a trust-aware, multi-runtime academic coordination system where the hardest engineering problems were not UI problems at all.
The portfolio value of this product is not that it has announcements, exams, files, or a chatbot. Many projects can list those features.
The real value is that this system forced me to confront the parts of software that only start to matter once features interact:
- what is the source of truth
- who is allowed to mutate it
- how changes propagate
- how you stay fast without lying
- how you automate without losing human control
- how you use AI without turning the product into a hallucination engine
That is the level at which I now try to think about software.