They chose Node.js and Express.js for their speed and scalability in backend development. PostgreSQL was selected for its robust data management capabilities, essential for handling sensitive user data.
Swift was the natural choice for iOS app development due to its efficiency and native compatibility. The UIKit and SwiftUI frameworks were used to build a responsive user interface. Additionally, Firebase and Core ML were incorporated for functionality such as social authentication and on-device machine learning.
Recognizing the potential of AI in enhancing user interaction, shopdev planned the integration of GPT-3.5 Turbo and ChatGPT. This move was aimed at facilitating dynamic and intelligent avatar conversations, tapping into OpenAI's advanced language processing capabilities.
The TRD recommended using DigitalOcean Functions for efficient serverless operations and Spaces CDN for optimized content delivery. DigitalOcean's Cloud Firewalls were chosen to ensure robust protection against external threats, a crucial aspect of securing web applications.
ChatGPT stands at the forefront of our technology stack. Leveraging OpenAI's GPT-3.5 Turbo model, it facilitates engaging, context-aware conversations with users. The module excels in processing natural language, allowing the AI avatar to communicate effectively and respond with a high degree of relevance and personalization.
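To make the idea of context-aware conversation concrete, here is a minimal sketch of how a backend might assemble a chat request for the GPT-3.5 Turbo model. The role-tagged message list and the `gpt-3.5-turbo` model name follow OpenAI's chat format; the function name `buildChatRequest`, the `persona` prompt, and the `maxHistory` cutoff are assumptions made for illustration, not details taken from shopdev's implementation.

```typescript
// Roles used by OpenAI's chat message format.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
}

// Hypothetical helper: builds the request body for one avatar turn.
// The persona goes in as a system message, followed by the most recent
// turns of the conversation, followed by the new user input.
function buildChatRequest(
  persona: string,
  history: ChatMessage[],
  userText: string,
  maxHistory: number = 6 // assumed cutoff to keep the prompt short
): ChatRequest {
  // slice(-0) would return the whole array, so handle 0 explicitly.
  const recent = maxHistory > 0 ? history.slice(-maxHistory) : [];
  return {
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: persona },
      ...recent,
      { role: "user", content: userText },
    ],
  };
}
```

Sending earlier turns along with each new message is what lets the model respond in context, and capping the history is one simple way to keep request size under control.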
Beyond standard conversations, ChatGPT is adept at interpreting visual content and textual data. It can analyze images to extract pertinent information and interpret PDF documents, turning unstructured data into actionable insights. This feature enriches user interactions, enabling the AI avatar to discuss a wide range of topics and provide informed responses based on external data sources.
The Tortoise text-to-speech model plays a pivotal role in giving voice to the AI avatar. It converts textual responses generated by ChatGPT into natural-sounding speech, enhancing the realism of the avatar.
This module is known for its ability to generate high-quality, lifelike voice output. It offers customization options, such as adjusting tone and speech rate, allowing the avatar's voice to match its personality and the context of the conversation.
Apple's voice recognition technology is integrated to facilitate a two-way communication channel. It efficiently translates users' spoken words into text, enabling the app to understand and process commands or queries spoken aloud.
Known for its high accuracy and speed, Apple's voice recognition ensures that user inputs are captured correctly and promptly, contributing to a fluid and intuitive user experience.
The Wav2Lip model is critical in syncing the avatar's lip movements with the spoken voice. This synchronization is essential for creating a believable and immersive interaction experience.
By accurately matching lip movements to the generated speech, Wav2Lip adds a layer of realism to the avatar, making conversations feel more natural and engaging. This visual coherence between speech and movement is crucial for maintaining the illusion of a live, interactive entity.
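Taken together, the four modules form a pipeline for each conversational turn: speech to text, text to reply, reply to voice, and voice to lip-synced video. The sketch below shows that ordering only. The `AvatarStages` interface and every method name in it are hypothetical placeholders rather than the real APIs of Apple's speech recognition, OpenAI, Tortoise, or Wav2Lip, and the sketch makes no claim about which stages run on the device and which run on the server.

```typescript
// Hypothetical stage interface: each method stands in for one module
// described above. None of these signatures come from a real library.
interface AvatarStages {
  transcribe(audio: Uint8Array): Promise<string>;     // speech recognition
  reply(userText: string): Promise<string>;           // language model
  synthesize(replyText: string): Promise<Uint8Array>; // text-to-speech
  lipSync(speech: Uint8Array): Promise<Uint8Array>;   // lip-synced video
}

interface AvatarTurn {
  userText: string;
  replyText: string;
  speech: Uint8Array;
  video: Uint8Array;
}

// Runs one conversational turn. Each stage consumes the output of the
// previous one, so the calls are awaited strictly in sequence.
async function runAvatarTurn(
  stages: AvatarStages,
  audio: Uint8Array
): Promise<AvatarTurn> {
  const userText = await stages.transcribe(audio);
  const replyText = await stages.reply(userText);
  const speech = await stages.synthesize(replyText);
  // Assumes the avatar's face footage is already bound inside the stage.
  const video = await stages.lipSync(speech);
  return { userText, replyText, speech, video };
}
```

Because lip sync needs the finished speech audio and speech synthesis needs the finished reply, the stages cannot be reordered, which is why latency in any one module is felt across the whole interaction.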
The PRD meticulously detailed the user onboarding process, including the calling features and various account settings. It catered to different user personas, such as free and paid users, ensuring the app's accessibility and appeal to a broad audience.
shopdev’s PRD outlined the app's aesthetic and functional design, focusing on user-friendliness. It included specifics on splash screens, avatar selection, and call functionalities, ensuring a seamless user experience.
The PRD defined the requirements for app store upload and accessibility. This included details on payment integration and the structure of in-app purchases, ensuring a straightforward and secure transaction process for users.