Async release

Carlos Mesquita
2024-07-23 08:40:35 +01:00
parent a4caecdb4f
commit 3cf9fa5cba
116 changed files with 5609 additions and 30630 deletions


@@ -5,3 +5,4 @@ README.md
*.pyd
__pycache__
.pytest_cache
postman

2
.env

@@ -3,3 +3,5 @@ JWT_SECRET_KEY=6e9c124ba92e8814719dcb0f21200c8aa4d0f119a994ac5e06eb90a366c83ab2
JWT_TEST_TOKEN=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJ0ZXN0In0.Emrs2D3BmMP4b3zMjw0fJTPeyMwWEBDbxx2vvaWguO0
GOOGLE_APPLICATION_CREDENTIALS=firebase-configs/storied-phalanx-349916.json
HEY_GEN_TOKEN=MjY4MDE0MjdjZmNhNDFmYTlhZGRkNmI3MGFlMzYwZDItMTY5NTExNzY3MA==
GPT_ZERO_API_KEY=0195b9bb24c5439899f71230809c74af

2
.gitignore vendored

@@ -2,3 +2,5 @@ __pycache__
.idea
.env
.DS_Store
firebase-configs/local.json
.venv

3
.idea/misc.xml generated

@@ -1,4 +1,7 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
  <component name="ProjectRootManager" version="2" project-jdk-name="Python 3.9" project-jdk-type="Python SDK" />
  <component name="PyCharmProfessionalAdvertiser">
    <option name="shown" value="true" />
  </component>
</project>


@@ -1,6 +1,10 @@
FROM python:3.11-slim as requirements-stage
WORKDIR /tmp
RUN pip install poetry
COPY pyproject.toml ./poetry.lock* /tmp/
RUN poetry export -f requirements.txt --output requirements.txt --without-hashes
# Use the official lightweight Python image.
# https://hub.docker.com/_/python
FROM python:3.11-slim
# Allow statements and log messages to immediately appear in the logs
@@ -9,18 +13,25 @@ ENV PYTHONUNBUFFERED True
# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./
COPY --from=requirements-stage /tmp/requirements.txt /app/requirements.txt
RUN apt update && apt install -y ffmpeg
-# Install production dependencies.
-RUN pip install --no-cache-dir -r requirements.txt
-EXPOSE 5000
+RUN pip install openai-whisper
+# The openai-whisper model is not compatible with the newer numpy 2.0.0 release
+RUN pip install --upgrade "numpy<2"
+RUN pip install --no-cache-dir -r /app/requirements.txt
+EXPOSE 8000
# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
# Timeout is set to 0 to disable the timeouts of the workers to allow Cloud Run to handle instance scaling.
-CMD exec gunicorn --bind 0.0.0.0:5000 --workers 1 --threads 8 --timeout 0 app:app
+CMD exec uvicorn app.server:app --host 0.0.0.0 --port 8000 --workers 1

72
README.md Normal file

@@ -0,0 +1,72 @@
# Disclaimer
I didn't fully test all the endpoints. The main purpose of this release was to make ielts-be async, but I've also
separated the logic into layers, removed some duplication and implemented dependency injection, so there
could be errors, and extensive testing is needed before even considering deploying (if you're even considering it).
This was refactored from commit a4caecd (2024-06-13) on the master branch.
# Changes
Since one of my use cases is load testing with 5000 concurrent users and ielts-be is sync, I've refactored ielts-be
into this FastAPI app.
The ielts-be Dockerfile runs the container with:
```CMD exec gunicorn --bind 0.0.0.0:5000 --workers 1 --threads 8 --timeout 0 app:app```
Since gunicorn uses WSGI and ielts-be has mostly sync, blocking I/O operations, every time a request encounters
a blocking I/O operation a thread is blocked. Since this config is 1 worker with 8 threads, the container
will only be able to handle 8 concurrent requests at a time before gcloud run cold starts another instance.
Flask was built with WSGI in mind, with Quart as its async alternative. Even though you can serve Flask
with uvicorn using the [asgiref](https://pypi.org/project/asgiref/) adapter, FastAPI has better performance
than both alternatives, and the sync calls would need to be modified either way.
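To make the thread math above concrete, here is a minimal sketch (not part of ielts-be) that uses only the standard library. It assumes each request spends 50 ms waiting on I/O; `IO_DELAY`, `REQUESTS` and the handler names are made up for the illustration. It pushes 32 simulated requests through an 8-thread pool, mirroring `--threads 8`, and then through a single asyncio event loop.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

IO_DELAY = 0.05   # assumed duration of one blocking I/O call, in seconds
THREADS = 8       # mirrors "--threads 8" in the gunicorn command
REQUESTS = 32     # hypothetical burst of simultaneous requests


def blocking_handler(_: int) -> None:
    """Sync handler: the thread is held for the whole I/O wait."""
    time.sleep(IO_DELAY)


async def async_handler(_: int) -> None:
    """Async handler: the event loop is free while the I/O is pending."""
    await asyncio.sleep(IO_DELAY)


def run_threaded() -> float:
    """Serve all requests from a fixed pool of threads; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=THREADS) as pool:
        list(pool.map(blocking_handler, range(REQUESTS)))
    return time.perf_counter() - start


async def _gather_all() -> None:
    await asyncio.gather(*(async_handler(i) for i in range(REQUESTS)))


def run_async() -> float:
    """Serve all requests on one event loop; return elapsed seconds."""
    start = time.perf_counter()
    asyncio.run(_gather_all())
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"{THREADS} threads, blocking I/O: {run_threaded():.2f}s")
    print(f"one event loop, async I/O: {run_async():.2f}s")
```

With these assumed numbers the thread pool needs about four rounds of 50 ms, while the event loop overlaps all the waits. Real handlers also do CPU work, so treat this as an illustration of the blocking behaviour rather than a benchmark.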
# Endpoints
In ielts-ui I've added a wrapper to every backend request in '/src/utils/translate.backend.endpoints.ts' that uses the
new endpoints if the "BACKEND_TYPE" environment variable is set to "async". If the variable is not present or
has another value, the wrapper returns the old endpoint.
| Method | ielts-be | This one |
|--------|--------------------------------------|------------------------------------------|
| GET | /healthcheck | /api/healthcheck |
| GET | /listening_section_1 | /api/listening/section/1 |
| GET | /listening_section_2 | /api/listening/section/2 |
| GET | /listening_section_3 | /api/listening/section/3 |
| GET | /listening_section_4 | /api/listening/section/4 |
| POST | /listening | /api/listening |
| POST | /writing_task1 | /api/grade/writing/1 |
| POST | /writing_task2 | /api/grade/writing/2 |
| GET | /writing_task1_general | /api/writing/1 |
| GET | /writing_task2_general | /api/writing/2 |
| POST | /speaking_task_1 | /api/grade/speaking/1 |
| POST | /speaking_task_2 | /api/grade/speaking/2 |
| POST | /speaking_task_3 | /api/grade/speaking/3 |
| GET | /speaking_task_1 | /api/speaking/1 |
| GET | /speaking_task_2 | /api/speaking/2 |
| GET | /speaking_task_3 | /api/speaking/3 |
| POST | /speaking | /api/speaking |
| POST | /speaking/generate_speaking_video | /api/speaking/generate_speaking_video |
| POST | /speaking/generate_interactive_video | /api/speaking/generate_interactive_video |
| GET | /reading_passage_1 | /api/reading/passage/1 |
| GET | /reading_passage_2 | /api/reading/passage/2 |
| GET | /reading_passage_3 | /api/reading/passage/3 |
| GET | /level | /api/level |
| GET | /level_utas | /api/level/utas |
| POST | /fetch_tips | /api/training/tips |
| POST | /grading_summary | /api/grade/summary |
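
The actual wrapper lives in ielts-ui and is written in TypeScript; it isn't shown here. As a rough Python sketch of the same idea, assuming a hypothetical `translate_endpoint` helper and covering only part of the table above, the lookup could look like this. Note that the method matters: `/speaking_task_N` maps to a different path for GET than for POST. Returning unknown paths unchanged is my assumption, not something the wrapper is confirmed to do.

```python
import os
import re
from typing import Optional

# (method, old-path regex, new-path template): a subset of the table above.
_RULES = [
    ("GET", r"^/healthcheck$", "/api/healthcheck"),
    ("GET", r"^/listening_section_(\d)$", r"/api/listening/section/\1"),
    ("POST", r"^/listening$", "/api/listening"),
    ("POST", r"^/writing_task(\d)$", r"/api/grade/writing/\1"),
    ("GET", r"^/writing_task(\d)_general$", r"/api/writing/\1"),
    ("POST", r"^/speaking_task_(\d)$", r"/api/grade/speaking/\1"),
    ("GET", r"^/speaking_task_(\d)$", r"/api/speaking/\1"),
    ("GET", r"^/reading_passage_(\d)$", r"/api/reading/passage/\1"),
    ("GET", r"^/level_utas$", "/api/level/utas"),
    ("POST", r"^/fetch_tips$", "/api/training/tips"),
    ("POST", r"^/grading_summary$", "/api/grade/summary"),
]


def translate_endpoint(method: str, old_path: str,
                       backend_type: Optional[str] = None) -> str:
    """Return the new path when BACKEND_TYPE is "async", else the old one."""
    if backend_type is None:
        backend_type = os.environ.get("BACKEND_TYPE")
    if backend_type != "async":
        return old_path
    for rule_method, pattern, template in _RULES:
        if rule_method == method.upper() and re.match(pattern, old_path):
            return re.sub(pattern, template, old_path)
    # Assumption: paths without a rule are passed through untouched.
    return old_path


if __name__ == "__main__":
    print(translate_endpoint("POST", "/speaking_task_2", "async"))
    print(translate_endpoint("GET", "/speaking_task_2", "async"))
```
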
# Run the app
This is for Windows; creating and activating the venv may differ based on your OS.
1. python -m venv env
2. env\Scripts\activate
3. pip install openai-whisper
4. pip install --upgrade "numpy<2"
5. pip install poetry
6. poetry install
7. python main.py

0
app/__init__.py Normal file

18
app/api/__init__.py Normal file

@@ -0,0 +1,18 @@
from fastapi import APIRouter

from .home import home_router
from .level import level_router
from .listening import listening_router
from .reading import reading_router
from .speaking import speaking_router
from .training import training_router
from .writing import writing_router
from .grade import grade_router

router = APIRouter()
router.include_router(home_router, prefix="/api", tags=["Home"])
router.include_router(listening_router, prefix="/api/listening", tags=["Listening"])
router.include_router(reading_router, prefix="/api/reading", tags=["Reading"])
router.include_router(speaking_router, prefix="/api/speaking", tags=["Speaking"])
router.include_router(writing_router, prefix="/api/writing", tags=["Writing"])
router.include_router(grade_router, prefix="/api/grade", tags=["Grade"])
router.include_router(training_router, prefix="/api/training", tags=["Training"])
# level.py defines level_router and the README lists /api/level and /api/level/utas,
# so the router is mounted here as well.
router.include_router(level_router, prefix="/api/level", tags=["Level"])

49
app/api/grade.py Normal file

@@ -0,0 +1,49 @@
from dependency_injector.wiring import inject, Provide
from fastapi import APIRouter, Depends, Path, Request

from app.controllers.abc import IGradeController
from app.dtos import WritingGradeTaskDTO
from app.middlewares import Authorized, IsAuthenticatedViaBearerToken

controller = "grade_controller"
grade_router = APIRouter()


@grade_router.post(
    '/writing/{task}',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def grade_writing_task(
        data: WritingGradeTaskDTO,
        task: int = Path(..., ge=1, le=2),
        grade_controller: IGradeController = Depends(Provide[controller])
):
    return await grade_controller.grade_writing_task(task, data)


@grade_router.post(
    '/speaking/{task}',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def grade_speaking_task(
        request: Request,
        task: int = Path(..., ge=1, le=3),
        grade_controller: IGradeController = Depends(Provide[controller])
):
    data = await request.json()
    return await grade_controller.grade_speaking_task(task, data)


@grade_router.post(
    '/summary',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def grading_summary(
        request: Request,
        grade_controller: IGradeController = Depends(Provide[controller])
):
    data = await request.json()
    return await grade_controller.grading_summary(data)

9
app/api/home.py Normal file

@@ -0,0 +1,9 @@
from fastapi import APIRouter

home_router = APIRouter()


@home_router.get(
    '/healthcheck'
)
async def healthcheck():
    return {"healthy": True}

30
app/api/level.py Normal file

@@ -0,0 +1,30 @@
from dependency_injector.wiring import Provide, inject
from fastapi import APIRouter, Depends

from app.middlewares import Authorized, IsAuthenticatedViaBearerToken
from app.controllers.abc import ILevelController

controller = "level_controller"
level_router = APIRouter()


@level_router.get(
    '/',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def get_level_exam(
        level_controller: ILevelController = Depends(Provide[controller])
):
    return await level_controller.get_level_exam()


@level_router.get(
    '/utas',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def get_level_utas(
        level_controller: ILevelController = Depends(Provide[controller])
):
    return await level_controller.get_level_exam()

40
app/api/listening.py Normal file
View File

@@ -0,0 +1,40 @@
import random

from dependency_injector.wiring import Provide, inject
from fastapi import APIRouter, Depends, Path, Query

from app.middlewares import Authorized, IsAuthenticatedViaBearerToken
from app.controllers.abc import IListeningController
from app.configs.constants import EducationalContent
from app.dtos import SaveListeningDTO

controller = "listening_controller"
listening_router = APIRouter()


@listening_router.get(
    '/section/{section}',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def get_listening_question(
        # Declared with Query so FastAPI reads the list from the query string
        # (as reading.py does) instead of expecting a request body on a GET.
        exercises: list[str] = Query(default=[]),
        section: int = Path(..., ge=1, le=4),
        topic: str | None = None,
        difficulty: str = random.choice(EducationalContent.DIFFICULTIES),
        listening_controller: IListeningController = Depends(Provide[controller])
):
    return await listening_controller.get_listening_question(section, topic, exercises, difficulty)


@listening_router.post(
    '/',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def save_listening(
        data: SaveListeningDTO,
        listening_controller: IListeningController = Depends(Provide[controller])
):
    return await listening_controller.save_listening(data)
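
One thing worth noting about the `difficulty` default above: Python evaluates default argument values once, when the function is defined, so a `random.choice(...)` in a signature (or inside `Query(default=...)`) is drawn a single time at import and then reused for every request. A minimal standard-library sketch of the difference, with `pick_frozen` and `pick_fresh` as hypothetical names:

```python
import random
from typing import Optional

# Same values as EducationalContent.DIFFICULTIES in this commit.
DIFFICULTIES = ["easy", "medium", "hard"]


def pick_frozen(difficulty: str = random.choice(DIFFICULTIES)) -> str:
    """The default is evaluated once, when the function is defined."""
    return difficulty


def pick_fresh(difficulty: Optional[str] = None) -> str:
    """The choice is made in the body, so it is re-drawn on every call."""
    return difficulty or random.choice(DIFFICULTIES)


if __name__ == "__main__":
    print({pick_frozen() for _ in range(50)})  # always a single value
    print({pick_fresh() for _ in range(50)})   # usually all three values
```

In a route, the equivalent of `pick_fresh` would be to declare the parameter as optional and draw the random value inside the handler body.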

28
app/api/reading.py Normal file
View File

@@ -0,0 +1,28 @@
import random

from dependency_injector.wiring import Provide, inject
from fastapi import APIRouter, Depends, Path, Query

from app.middlewares import Authorized, IsAuthenticatedViaBearerToken
from app.configs.constants import EducationalContent
from app.controllers.abc import IReadingController

controller = "reading_controller"
reading_router = APIRouter()


@reading_router.get(
    '/passage/{passage}',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def get_reading_passage(
        passage: int = Path(..., ge=1, le=3),
        topic: str = Query(default=random.choice(EducationalContent.TOPICS)),
        exercises: list[str] = Query(default=[]),
        difficulty: str = Query(default=random.choice(EducationalContent.DIFFICULTIES)),
        reading_controller: IReadingController = Depends(Provide[controller])
):
    return await reading_controller.get_reading_passage(passage, topic, exercises, difficulty)

63
app/api/speaking.py Normal file
View File

@@ -0,0 +1,63 @@
import random

from dependency_injector.wiring import inject, Provide
from fastapi import APIRouter, Path, Query, Depends, BackgroundTasks

from app.middlewares import Authorized, IsAuthenticatedViaBearerToken
from app.configs.constants import EducationalContent
from app.controllers.abc import ISpeakingController
from app.dtos import SaveSpeakingDTO, SpeakingGenerateVideoDTO, SpeakingGenerateInteractiveVideoDTO

controller = "speaking_controller"
speaking_router = APIRouter()


@speaking_router.get(
    '/{task}',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def get_speaking_task(
        task: int = Path(..., ge=1, le=3),
        topic: str = Query(default=random.choice(EducationalContent.MTI_TOPICS)),
        difficulty: str = Query(default=random.choice(EducationalContent.DIFFICULTIES)),
        speaking_controller: ISpeakingController = Depends(Provide[controller])
):
    return await speaking_controller.get_speaking_task(task, topic, difficulty)


@speaking_router.post(
    '/',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def save_speaking(
        data: SaveSpeakingDTO,
        background_tasks: BackgroundTasks,
        speaking_controller: ISpeakingController = Depends(Provide[controller])
):
    return await speaking_controller.save_speaking(data, background_tasks)


@speaking_router.post(
    '/generate_speaking_video',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def generate_speaking_video(
        data: SpeakingGenerateVideoDTO,
        speaking_controller: ISpeakingController = Depends(Provide[controller])
):
    return await speaking_controller.generate_speaking_video(data)


@speaking_router.post(
    '/generate_interactive_video',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def generate_interactive_video(
        data: SpeakingGenerateInteractiveVideoDTO,
        speaking_controller: ISpeakingController = Depends(Provide[controller])
):
    return await speaking_controller.generate_interactive_video(data)

21
app/api/training.py Normal file
View File

@@ -0,0 +1,21 @@
from dependency_injector.wiring import Provide, inject
from fastapi import APIRouter, Depends

from app.dtos import TipsDTO
from app.middlewares import Authorized, IsAuthenticatedViaBearerToken
from app.controllers.abc import ITrainingController

controller = "training_controller"
training_router = APIRouter()


@training_router.post(
    '/tips',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
# Renamed from get_reading_passage, which looked like a copy-paste leftover.
async def fetch_tips(
        data: TipsDTO,
        training_controller: ITrainingController = Depends(Provide[controller])
):
    return await training_controller.fetch_tips(data)

25
app/api/writing.py Normal file
View File

@@ -0,0 +1,25 @@
import random

from dependency_injector.wiring import inject, Provide
from fastapi import APIRouter, Path, Query, Depends

from app.middlewares import Authorized, IsAuthenticatedViaBearerToken
from app.configs.constants import EducationalContent
from app.controllers.abc import IWritingController

controller = "writing_controller"
writing_router = APIRouter()


@writing_router.get(
    '/{task}',
    dependencies=[Depends(Authorized([IsAuthenticatedViaBearerToken]))]
)
@inject
async def get_writing_task_general_question(
        task: int = Path(..., ge=1, le=2),
        topic: str = Query(default=random.choice(EducationalContent.MTI_TOPICS)),
        difficulty: str = Query(default=random.choice(EducationalContent.DIFFICULTIES)),
        writing_controller: IWritingController = Depends(Provide[controller])
):
    return await writing_controller.get_writing_task_general_question(task, topic, difficulty)

5
app/configs/__init__.py Normal file
View File

@@ -0,0 +1,5 @@
from .dependency_injection import config_di

__all__ = [
    "config_di"
]

706
app/configs/constants.py Normal file
View File

@@ -0,0 +1,706 @@
from enum import Enum

BLACKLISTED_WORDS = ["jesus", "sex", "gay", "lesbian", "homosexual", "god", "angel", "pornography", "beer", "wine",
                     "cocaine", "alcohol", "nudity", "lgbt", "casino", "gambling", "catholicism",
                     "discrimination", "politics", "politic", "christianity", "islam", "christian", "christians",
                     "jews", "jew", "discrimination", "discriminatory"]


class ExamVariant(Enum):
    FULL = "full"
    PARTIAL = "partial"


class QuestionType(Enum):
    LISTENING_SECTION_1 = "Listening Section 1"
    LISTENING_SECTION_2 = "Listening Section 2"
    LISTENING_SECTION_3 = "Listening Section 3"
    LISTENING_SECTION_4 = "Listening Section 4"
    WRITING_TASK_1 = "Writing Task 1"
    WRITING_TASK_2 = "Writing Task 2"
    SPEAKING_1 = "Speaking Task Part 1"
    SPEAKING_2 = "Speaking Task Part 2"
    READING_PASSAGE_1 = "Reading Passage 1"
    READING_PASSAGE_2 = "Reading Passage 2"
    READING_PASSAGE_3 = "Reading Passage 3"


class AvatarEnum(Enum):
    MATTHEW_NOAH = "5912afa7c77c47d3883af3d874047aaf"
    VERA_CERISE = "9e58d96a383e4568a7f1e49df549e0e4"
    EDWARD_TONY = "d2cdd9c0379a4d06ae2afb6e5039bd0c"
    TANYA_MOLLY = "045cb5dcd00042b3a1e4f3bc1c12176b"
    KAYLA_ABBI = "1ae1e5396cc444bfad332155fdb7a934"
    JEROME_RYAN = "0ee6aa7cc1084063a630ae514fccaa31"
    TYLER_CHRISTOPHER = "5772cff935844516ad7eeff21f839e43"


class FilePaths:
    AUDIO_FILES_PATH = 'download-audio/'
    FIREBASE_LISTENING_AUDIO_FILES_PATH = 'listening_recordings/'
    VIDEO_FILES_PATH = 'download-video/'
    FIREBASE_SPEAKING_VIDEO_FILES_PATH = 'speaking_videos/'


class TemperatureSettings:
    GRADING_TEMPERATURE = 0.1
    TIPS_TEMPERATURE = 0.2
    GEN_QUESTION_TEMPERATURE = 0.7


class GPTModels:
    GPT_3_5_TURBO = "gpt-3.5-turbo"
    GPT_4_TURBO = "gpt-4-turbo"
    GPT_4_O = "gpt-4o"
    GPT_3_5_TURBO_16K = "gpt-3.5-turbo-16k"
    GPT_3_5_TURBO_INSTRUCT = "gpt-3.5-turbo-instruct"
    GPT_4_PREVIEW = "gpt-4-turbo-preview"


class FieldsAndExercises:
    GRADING_FIELDS = ['comment', 'overall', 'task_response']
    GEN_FIELDS = ['topic']
    GEN_TEXT_FIELDS = ['title']
    LISTENING_GEN_FIELDS = ['transcript', 'exercise']
    READING_EXERCISE_TYPES = ['fillBlanks', 'writeBlanks', 'trueFalse', 'paragraphMatch']
    LISTENING_EXERCISE_TYPES = ['multipleChoice', 'writeBlanksQuestions', 'writeBlanksFill', 'writeBlanksForm']
    TOTAL_READING_PASSAGE_1_EXERCISES = 13
    TOTAL_READING_PASSAGE_2_EXERCISES = 13
    TOTAL_READING_PASSAGE_3_EXERCISES = 14
    TOTAL_LISTENING_SECTION_1_EXERCISES = 10
    TOTAL_LISTENING_SECTION_2_EXERCISES = 10
    TOTAL_LISTENING_SECTION_3_EXERCISES = 10
    TOTAL_LISTENING_SECTION_4_EXERCISES = 10


class MinTimers:
    LISTENING_MIN_TIMER_DEFAULT = 30
    WRITING_MIN_TIMER_DEFAULT = 60
    SPEAKING_MIN_TIMER_DEFAULT = 14
class Voices:
EN_US_VOICES = [
{'Gender': 'Female', 'Id': 'Salli', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Salli',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Male', 'Id': 'Matthew', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Matthew',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Female', 'Id': 'Kimberly', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Kimberly',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Female', 'Id': 'Kendra', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Kendra',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Male', 'Id': 'Justin', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Justin',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Male', 'Id': 'Joey', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Joey',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Female', 'Id': 'Joanna', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Joanna',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Female', 'Id': 'Ivy', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Ivy',
'SupportedEngines': ['neural', 'standard']}]
EN_GB_VOICES = [
{'Gender': 'Female', 'Id': 'Emma', 'LanguageCode': 'en-GB', 'LanguageName': 'British English', 'Name': 'Emma',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Male', 'Id': 'Brian', 'LanguageCode': 'en-GB', 'LanguageName': 'British English', 'Name': 'Brian',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Female', 'Id': 'Amy', 'LanguageCode': 'en-GB', 'LanguageName': 'British English', 'Name': 'Amy',
'SupportedEngines': ['neural', 'standard']}]
EN_GB_WLS_VOICES = [
{'Gender': 'Male', 'Id': 'Geraint', 'LanguageCode': 'en-GB-WLS', 'LanguageName': 'Welsh English', 'Name': 'Geraint',
'SupportedEngines': ['standard']}]
EN_AU_VOICES = [{'Gender': 'Male', 'Id': 'Russell', 'LanguageCode': 'en-AU', 'LanguageName': 'Australian English',
'Name': 'Russell', 'SupportedEngines': ['standard']},
{'Gender': 'Female', 'Id': 'Nicole', 'LanguageCode': 'en-AU', 'LanguageName': 'Australian English',
'Name': 'Nicole', 'SupportedEngines': ['standard']}]
ALL_VOICES = EN_US_VOICES + EN_GB_VOICES + EN_GB_WLS_VOICES + EN_AU_VOICES
MALE_VOICES = [item for item in ALL_VOICES if item.get('Gender') == 'Male']
FEMALE_VOICES = [item for item in ALL_VOICES if item.get('Gender') == 'Female']
class NeuralVoices:
NEURAL_EN_US_VOICES = [
{'Gender': 'Female', 'Id': 'Danielle', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Danielle',
'SupportedEngines': ['neural']},
{'Gender': 'Male', 'Id': 'Gregory', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Gregory',
'SupportedEngines': ['neural']},
{'Gender': 'Male', 'Id': 'Kevin', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Kevin',
'SupportedEngines': ['neural']},
{'Gender': 'Female', 'Id': 'Ruth', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Ruth',
'SupportedEngines': ['neural']},
{'Gender': 'Male', 'Id': 'Stephen', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Stephen',
'SupportedEngines': ['neural']}]
NEURAL_EN_GB_VOICES = [
{'Gender': 'Male', 'Id': 'Arthur', 'LanguageCode': 'en-GB', 'LanguageName': 'British English', 'Name': 'Arthur',
'SupportedEngines': ['neural']}]
NEURAL_EN_AU_VOICES = [
{'Gender': 'Female', 'Id': 'Olivia', 'LanguageCode': 'en-AU', 'LanguageName': 'Australian English',
'Name': 'Olivia', 'SupportedEngines': ['neural']}]
NEURAL_EN_ZA_VOICES = [
{'Gender': 'Female', 'Id': 'Ayanda', 'LanguageCode': 'en-ZA', 'LanguageName': 'South African English',
'Name': 'Ayanda', 'SupportedEngines': ['neural']}]
NEURAL_EN_NZ_VOICES = [
{'Gender': 'Female', 'Id': 'Aria', 'LanguageCode': 'en-NZ', 'LanguageName': 'New Zealand English', 'Name': 'Aria',
'SupportedEngines': ['neural']}]
NEURAL_EN_IN_VOICES = [
{'Gender': 'Female', 'Id': 'Kajal', 'LanguageCode': 'en-IN', 'LanguageName': 'Indian English', 'Name': 'Kajal',
'SupportedEngines': ['neural']}]
NEURAL_EN_IE_VOICES = [
{'Gender': 'Female', 'Id': 'Niamh', 'LanguageCode': 'en-IE', 'LanguageName': 'Irish English', 'Name': 'Niamh',
'SupportedEngines': ['neural']}]
ALL_NEURAL_VOICES = NEURAL_EN_US_VOICES + NEURAL_EN_GB_VOICES + NEURAL_EN_AU_VOICES + NEURAL_EN_ZA_VOICES + NEURAL_EN_NZ_VOICES + NEURAL_EN_IE_VOICES
MALE_NEURAL_VOICES = [item for item in ALL_NEURAL_VOICES if item.get('Gender') == 'Male']
FEMALE_NEURAL_VOICES = [item for item in ALL_NEURAL_VOICES if item.get('Gender') == 'Female']
class EducationalContent:
DIFFICULTIES = ["easy", "medium", "hard"]
MTI_TOPICS = [
"Education",
"Technology",
"Environment",
"Health and Fitness",
"Engineering",
"Work and Careers",
"Travel and Tourism",
"Culture and Traditions",
"Social Issues",
"Arts and Entertainment",
"Climate Change",
"Social Media",
"Sustainable Development",
"Health Care",
"Immigration",
"Artificial Intelligence",
"Consumerism",
"Online Shopping",
"Energy",
"Oil and Gas",
"Poverty and Inequality",
"Cultural Diversity",
"Democracy and Governance",
"Mental Health",
"Ethics and Morality",
"Population Growth",
"Science and Innovation",
"Poverty Alleviation",
"Cybersecurity and Privacy",
"Human Rights",
"Food and Agriculture",
"Cyberbullying and Online Safety",
"Linguistic Diversity",
"Urbanization",
"Artificial Intelligence in Education",
"Youth Empowerment",
"Disaster Management",
"Mental Health Stigma",
"Internet Censorship",
"Sustainable Fashion",
"Indigenous Rights",
"Water Scarcity",
"Social Entrepreneurship",
"Privacy in the Digital Age",
"Sustainable Transportation",
"Gender Equality",
"Automation and Job Displacement",
"Digital Divide",
"Education Inequality"
]
TOPICS = [
"Art and Creativity",
"History of Ancient Civilizations",
"Environmental Conservation",
"Space Exploration",
"Artificial Intelligence",
"Climate Change",
"World Religions",
"The Human Brain",
"Renewable Energy",
"Cultural Diversity",
"Modern Technology Trends",
"Sustainable Agriculture",
"Natural Disasters",
"Cybersecurity",
"Philosophy of Ethics",
"Robotics",
"Health and Wellness",
"Literature and Classics",
"World Geography",
"Social Media Impact",
"Food Sustainability",
"Economics and Markets",
"Human Evolution",
"Political Systems",
"Mental Health Awareness",
"Quantum Physics",
"Biodiversity",
"Education Reform",
"Animal Rights",
"The Industrial Revolution",
"Future of Work",
"Film and Cinema",
"Genetic Engineering",
"Climate Policy",
"Space Travel",
"Renewable Energy Sources",
"Cultural Heritage Preservation",
"Modern Art Movements",
"Sustainable Transportation",
"The History of Medicine",
"Artificial Neural Networks",
"Climate Adaptation",
"Philosophy of Existence",
"Augmented Reality",
"Yoga and Meditation",
"Literary Genres",
"World Oceans",
"Social Networking",
"Sustainable Fashion",
"Prehistoric Era",
"Democracy and Governance",
"Postcolonial Literature",
"Geopolitics",
"Psychology and Behavior",
"Nanotechnology",
"Endangered Species",
"Education Technology",
"Renaissance Art",
"Renewable Energy Policy",
"Modern Architecture",
"Climate Resilience",
"Artificial Life",
"Fitness and Nutrition",
"Classic Literature Adaptations",
"Ethical Dilemmas",
"Internet of Things (IoT)",
"Meditation Practices",
"Literary Symbolism",
"Marine Conservation",
"Sustainable Tourism",
"Ancient Philosophy",
"Cold War Era",
"Behavioral Economics",
"Space Colonization",
"Clean Energy Initiatives",
"Cultural Exchange",
"Modern Sculpture",
"Climate Mitigation",
"Mindfulness",
"Literary Criticism",
"Wildlife Conservation",
"Renewable Energy Innovations",
"History of Mathematics",
"Human-Computer Interaction",
"Global Health",
"Cultural Appropriation",
"Traditional cuisine and culinary arts",
"Local music and dance traditions",
"History of the region and historical landmarks",
"Traditional crafts and artisanal skills",
"Wildlife and conservation efforts",
"Local sports and athletic competitions",
"Fashion trends and clothing styles",
"Education systems and advancements",
"Healthcare services and medical innovations",
"Family values and social dynamics",
"Travel destinations and tourist attractions",
"Environmental sustainability projects",
"Technological developments and innovations",
"Entrepreneurship and business ventures",
"Youth empowerment initiatives",
"Art exhibitions and cultural events",
"Philanthropy and community development projects"
]
TWO_PEOPLE_SCENARIOS = [
"Booking a table at a restaurant",
"Making a doctor's appointment",
"Asking for directions to a tourist attraction",
"Inquiring about public transportation options",
"Discussing weekend plans with a friend",
"Ordering food at a café",
"Renting a bicycle for a day",
"Arranging a meeting with a colleague",
"Talking to a real estate agent about renting an apartment",
"Discussing travel plans for an upcoming vacation",
"Checking the availability of a hotel room",
"Talking to a car rental service",
"Asking for recommendations at a library",
"Inquiring about opening hours at a museum",
"Discussing the weather forecast",
"Shopping for groceries",
"Renting a movie from a video store",
"Booking a flight ticket",
"Discussing a school assignment with a classmate",
"Making a reservation for a spa appointment",
"Talking to a customer service representative about a product issue",
"Discussing household chores with a family member",
"Planning a surprise party for a friend",
"Talking to a coworker about a project deadline",
"Inquiring about a gym membership",
"Discussing the menu options at a fast-food restaurant",
"Talking to a neighbor about a community event",
"Asking for help with computer problems",
"Discussing a recent sports game with a sports enthusiast",
"Talking to a pet store employee about buying a pet",
"Asking for information about a local farmer's market",
"Discussing the details of a home renovation project",
"Talking to a coworker about office supplies",
"Making plans for a family picnic",
"Inquiring about admission requirements at a university",
"Discussing the features of a new smartphone with a salesperson",
"Talking to a mechanic about car repairs",
"Making arrangements for a child's birthday party",
"Discussing a new diet plan with a nutritionist",
"Asking for information about a music concert",
"Talking to a hairdresser about getting a haircut",
"Inquiring about a language course at a language school",
"Discussing plans for a weekend camping trip",
"Talking to a bank teller about opening a new account",
"Ordering a drink at a coffee shop",
"Discussing a new book with a book club member",
"Talking to a librarian about library services",
"Asking for advice on finding a job",
"Discussing plans for a garden makeover with a landscaper",
"Talking to a travel agent about a cruise vacation",
"Inquiring about a fitness class at a gym",
"Ordering flowers for a special occasion",
"Discussing a new exercise routine with a personal trainer",
"Talking to a teacher about a child's progress in school",
"Asking for information about a local art exhibition",
"Discussing a home improvement project with a contractor",
"Talking to a babysitter about childcare arrangements",
"Making arrangements for a car service appointment",
"Inquiring about a photography workshop at a studio",
"Discussing plans for a family reunion with a relative",
"Talking to a tech support representative about computer issues",
"Asking for recommendations on pet grooming services",
"Discussing weekend plans with a significant other",
"Talking to a counselor about personal issues",
"Inquiring about a music lesson with a music teacher",
"Ordering a pizza for delivery",
"Making a reservation for a taxi",
"Discussing a new recipe with a chef",
"Talking to a fitness trainer about weight loss goals",
"Inquiring about a dance class at a dance studio",
"Ordering a meal at a food truck",
"Discussing plans for a weekend getaway with a partner",
"Talking to a florist about wedding flower arrangements",
"Asking for advice on home decorating",
"Discussing plans for a charity fundraiser event",
"Talking to a pet sitter about taking care of pets",
"Making arrangements for a spa day with a friend",
"Asking for recommendations on home improvement stores",
"Discussing weekend plans with a travel enthusiast",
"Talking to a car mechanic about car maintenance",
"Inquiring about a cooking class at a culinary school",
"Ordering a sandwich at a deli",
"Discussing plans for a family holiday party",
"Talking to a personal assistant about organizing tasks",
"Asking for information about a local theater production",
"Discussing a new DIY project with a home improvement expert",
"Talking to a wine expert about wine pairing",
"Making arrangements for a pet adoption",
"Asking for advice on planning a wedding"
]
SOCIAL_MONOLOGUE_CONTEXTS = [
"A guided tour of a historical museum",
"An introduction to a new city for tourists",
"An orientation session for new university students",
"A safety briefing for airline passengers",
"An explanation of the process of recycling",
"A lecture on the benefits of a healthy diet",
"A talk on the importance of time management",
"A monologue about wildlife conservation",
"An overview of local public transportation options",
"A presentation on the history of cinema",
"An introduction to the art of photography",
"A discussion about the effects of climate change",
"An overview of different types of cuisine",
"A lecture on the principles of financial planning",
"A monologue about sustainable energy sources",
"An explanation of the process of online shopping",
"A guided tour of a botanical garden",
"An introduction to a local wildlife sanctuary",
"A safety briefing for hikers in a national park",
"A talk on the benefits of physical exercise",
"A lecture on the principles of effective communication",
"A monologue about the impact of social media",
"An overview of the history of a famous landmark",
"An introduction to the world of fashion design",
"A discussion about the challenges of global poverty",
"An explanation of the process of organic farming",
"A presentation on the history of space exploration",
"An overview of traditional music from different cultures",
"A lecture on the principles of effective leadership",
"A monologue about the influence of technology",
"A guided tour of a famous archaeological site",
"An introduction to a local wildlife rehabilitation center",
"A safety briefing for visitors to a science museum",
"A talk on the benefits of learning a new language",
"A lecture on the principles of architectural design",
"A monologue about the impact of renewable energy",
"An explanation of the process of online banking",
"A presentation on the history of a famous art movement",
"An overview of traditional clothing from various regions",
"A lecture on the principles of sustainable agriculture",
"A discussion about the challenges of urban development",
"A monologue about the influence of social norms",
"A guided tour of a historical battlefield",
"An introduction to a local animal shelter",
"A safety briefing for participants in a charity run",
"A talk on the benefits of community involvement",
"A lecture on the principles of sustainable tourism",
"A monologue about the impact of alternative medicine",
"An explanation of the process of wildlife tracking",
"A presentation on the history of a famous inventor",
"An overview of traditional dance forms from different cultures",
"A lecture on the principles of ethical business practices",
"A discussion about the challenges of healthcare access",
"A monologue about the influence of cultural traditions",
"A guided tour of a famous lighthouse",
"An introduction to a local astronomy observatory",
"A safety briefing for participants in a team-building event",
"A talk on the benefits of volunteering",
"A lecture on the principles of wildlife protection",
"A monologue about the impact of space exploration",
"An explanation of the process of wildlife photography",
"A presentation on the history of a famous musician",
"An overview of traditional art forms from different cultures",
"A lecture on the principles of effective education",
"A discussion about the challenges of sustainable development",
"A monologue about the influence of cultural diversity",
"A guided tour of a famous national park",
"An introduction to a local marine conservation project",
"A safety briefing for participants in a hot air balloon ride",
"A talk on the benefits of cultural exchange programs",
"A lecture on the principles of wildlife conservation",
"A monologue about the impact of technological advancements",
"An explanation of the process of wildlife rehabilitation",
"A presentation on the history of a famous explorer",
"A lecture on the principles of effective marketing",
"A discussion about the challenges of environmental sustainability",
"A monologue about the influence of social entrepreneurship",
"A guided tour of a famous historical estate",
"An introduction to a local marine life research center",
"A safety briefing for participants in a zip-lining adventure",
"A talk on the benefits of cultural preservation",
"A lecture on the principles of wildlife ecology",
"A monologue about the impact of space technology",
"An explanation of the process of wildlife conservation",
"A presentation on the history of a famous scientist",
"An overview of traditional crafts and artisans from different cultures",
"A lecture on the principles of effective intercultural communication"
]
FOUR_PEOPLE_SCENARIOS = [
"A university lecture on history",
"A physics class discussing Newton's laws",
"A medical school seminar on anatomy",
"A training session on computer programming",
"A business school lecture on marketing strategies",
"A chemistry lab experiment and discussion",
"A language class practicing conversational skills",
"A workshop on creative writing techniques",
"A high school math lesson on calculus",
"A training program for customer service representatives",
"A lecture on environmental science and sustainability",
"A psychology class exploring human behavior",
"A music theory class analyzing compositions",
"A nursing school simulation for patient care",
"A computer science class on algorithms",
"A workshop on graphic design principles",
"A law school lecture on constitutional law",
"A geology class studying rock formations",
"A vocational training program for electricians",
"A history seminar focusing on ancient civilizations",
"A biology class dissecting specimens",
"A financial literacy course for adults",
"A literature class discussing classic novels",
"A training session for emergency response teams",
"A sociology lecture on social inequality",
"An art class exploring different painting techniques",
"A medical school seminar on diagnosis",
"A programming bootcamp teaching web development",
"An economics class analyzing market trends",
"A chemistry lab experiment on chemical reactions",
"A language class practicing pronunciation",
"A workshop on public speaking skills",
"A high school physics lesson on electromagnetism",
"A training program for IT professionals",
"A lecture on climate change and its effects",
"A psychology class studying cognitive psychology",
"A music class composing original songs",
"A nursing school simulation for patient assessment",
"A computer science class on data structures",
"A workshop on 3D modeling and animation",
"A law school lecture on contract law",
"A geography class examining world maps",
"A vocational training program for plumbers",
"A history seminar discussing revolutions",
"A biology class exploring genetics",
"A financial literacy course for teens",
"A literature class analyzing poetry",
"A training session for public speaking coaches",
"A sociology lecture on cultural diversity",
"An art class creating sculptures",
"A medical school seminar on surgical techniques",
"A programming bootcamp teaching app development",
"An economics class on global trade policies",
"A chemistry lab experiment on chemical bonding",
"A language class discussing idiomatic expressions",
"A workshop on conflict resolution",
"A high school biology lesson on evolution",
"A training program for project managers",
"A lecture on renewable energy sources",
"A psychology class on abnormal psychology",
"A music class rehearsing for a performance",
"A nursing school simulation for emergency response",
"A computer science class on cybersecurity",
"A workshop on digital marketing strategies",
"A law school lecture on intellectual property",
"A geology class analyzing seismic activity",
"A vocational training program for carpenters",
"A history seminar on the Renaissance",
"A chemistry class synthesizing compounds",
"A financial literacy course for seniors",
"A literature class interpreting Shakespearean plays",
"A training session for negotiation skills",
"A sociology lecture on urbanization",
"An art class creating digital art",
"A medical school seminar on patient communication",
"A programming bootcamp teaching mobile app development",
"An economics class on fiscal policy",
"A physics lab experiment on electromagnetism",
"A language class on cultural immersion",
"A workshop on time management",
"A high school chemistry lesson on stoichiometry",
"A training program for HR professionals",
"A lecture on space exploration and astronomy",
"A psychology class on human development",
"A music class practicing for a recital",
"A nursing school simulation for triage",
"A computer science class on web development frameworks",
"A workshop on team-building exercises",
"A law school lecture on criminal law",
"A geography class studying world cultures",
"A vocational training program for HVAC technicians",
"A history seminar on ancient civilizations",
"A biology class examining ecosystems",
"A financial literacy course for entrepreneurs",
"A literature class analyzing modern literature",
"A training session for leadership skills",
"A sociology lecture on gender studies",
"An art class exploring multimedia art",
"A medical school seminar on patient diagnosis",
"A programming bootcamp teaching software architecture"
]
ACADEMIC_SUBJECTS = [
"Astrophysics",
"Microbiology",
"Political Science",
"Environmental Science",
"Literature",
"Biochemistry",
"Sociology",
"Art History",
"Geology",
"Economics",
"Psychology",
"History of Architecture",
"Linguistics",
"Neurobiology",
"Anthropology",
"Quantum Mechanics",
"Urban Planning",
"Philosophy",
"Marine Biology",
"International Relations",
"Medieval History",
"Geophysics",
"Finance",
"Educational Psychology",
"Graphic Design",
"Paleontology",
"Macroeconomics",
"Cognitive Psychology",
"Renaissance Art",
"Archaeology",
"Microeconomics",
"Social Psychology",
"Contemporary Art",
"Meteorology",
"Political Philosophy",
"Space Exploration",
"Cognitive Science",
"Classical Music",
"Oceanography",
"Public Health",
"Gender Studies",
"Baroque Art",
"Volcanology",
"Business Ethics",
"Music Composition",
"Environmental Policy",
"Media Studies",
"Ancient History",
"Seismology",
"Marketing",
"Human Development",
"Modern Art",
"Astronomy",
"International Law",
"Developmental Psychology",
"Film Studies",
"American History",
"Soil Science",
"Entrepreneurship",
"Clinical Psychology",
"Contemporary Dance",
"Space Physics",
"Political Economy",
"Cognitive Neuroscience",
"20th Century Literature",
"Public Administration",
"European History",
"Atmospheric Science",
"Supply Chain Management",
"Social Work",
"Japanese Literature",
"Planetary Science",
"Labor Economics",
"Industrial-Organizational Psychology",
"French Philosophy",
"Biogeochemistry",
"Strategic Management",
"Educational Sociology",
"Postmodern Literature",
"Public Relations",
"Middle Eastern History",
"Oceanography",
"International Development",
"Human Resources Management",
"Educational Leadership",
"Russian Literature",
"Quantum Chemistry",
"Environmental Economics",
"Environmental Psychology",
"Ancient Philosophy",
"Immunology",
"Comparative Politics",
"Child Development",
"Fashion Design",
"Geological Engineering",
"Macroeconomic Policy",
"Media Psychology",
"Byzantine Art",
"Ecology",
"International Business"
]

View File

@@ -0,0 +1,108 @@
import os
from typing import Any
from dependency_injector import providers, containers
from firebase_admin import credentials
from openai import AsyncOpenAI
from httpx import AsyncClient as HTTPClient
from google.cloud.firestore_v1 import AsyncClient as FirestoreClient
from dotenv import load_dotenv
from app.repositories.impl import *
from app.services.impl import *
from app.controllers.impl import *
load_dotenv()
def config_di(
*, polly_client: Any, http_client: HTTPClient, whisper_model: Any
) -> None:
"""
Loads up all the common configs of all the environments
and then calls the specific env configs
"""
# Firebase token
cred = credentials.Certificate(os.getenv("GOOGLE_APPLICATION_CREDENTIALS"))
firebase_token = cred.get_access_token().access_token
container = containers.DynamicContainer()
openai_client = providers.Singleton(AsyncOpenAI)
polly_client = providers.Object(polly_client)
http_client = providers.Object(http_client)
firestore_client = providers.Singleton(FirestoreClient)
whisper_model = providers.Object(whisper_model)
llm = providers.Factory(OpenAI, client=openai_client)
stt = providers.Factory(OpenAIWhisper, model=whisper_model)
tts = providers.Factory(AWSPolly, client=polly_client)
vid_gen = providers.Factory(Heygen, client=http_client, heygen_token=os.getenv("HEY_GEN_TOKEN"))
ai_detector = providers.Factory(GPTZero, client=http_client, gpt_zero_key=os.getenv("GPT_ZERO_API_KEY"))
firebase_instance = providers.Factory(
FirebaseStorage, client=http_client, token=firebase_token, bucket=os.getenv("FIREBASE_BUCKET")
)
firestore = providers.Factory(Firestore, client=firestore_client)
# Services
listening_service = providers.Factory(
ListeningService, llm=llm, tts=tts, file_storage=firebase_instance, document_store=firestore
)
reading_service = providers.Factory(ReadingService, llm=llm)
speaking_service = providers.Factory(
SpeakingService, llm=llm, vid_gen=vid_gen,
file_storage=firebase_instance, document_store=firestore,
stt=stt
)
writing_service = providers.Factory(WritingService, llm=llm, ai_detector=ai_detector)
level_service = providers.Factory(
LevelService, llm=llm, document_store=firestore, reading_service=reading_service
)
grade_service = providers.Factory(
GradeService, llm=llm
)
training_service = providers.Factory(
TrainingService, llm=llm
)
# Controllers
container.grade_controller = providers.Factory(
GradeController, grade_service=grade_service, speaking_service=speaking_service, writing_service=writing_service
)
container.training_controller = providers.Factory(
TrainingController, training_service=training_service
)
container.level_controller = providers.Factory(
LevelController, level_service=level_service
)
container.listening_controller = providers.Factory(
ListeningController, listening_service=listening_service
)
container.reading_controller = providers.Factory(
ReadingController, reading_service=reading_service
)
container.speaking_controller = providers.Factory(
SpeakingController, speaking_service=speaking_service
)
container.writing_controller = providers.Factory(
WritingController, writing_service=writing_service
)
container.llm = llm
container.wire(
packages=["app"]
)

View File

@@ -0,0 +1,7 @@
from .filters import ErrorAndAboveFilter
from .queue_handler import QueueListenerHandler
__all__ = [
"ErrorAndAboveFilter",
"QueueListenerHandler"
]

View File

@@ -0,0 +1,6 @@
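import logging


class ErrorAndAboveFilter(logging.Filter):
    # Despite its name, this filter lets through only records below ERROR, so the
    # stdout handler it is attached to skips what the stderr handler already emits.
    def filter(self, record: logging.LogRecord) -> bool | logging.LogRecord:
        return record.levelno < logging.ERROR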
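The filter's name and its predicate point in opposite directions, so a quick check of which levels pass may help. This is a minimal sketch, assuming a stand-in class `BelowErrorFilter` (a hypothetical name, not part of this commit) that uses the same predicate:

```python
import logging


class BelowErrorFilter(logging.Filter):
    """Hypothetical stand-in using the same predicate as ErrorAndAboveFilter."""

    def filter(self, record: logging.LogRecord) -> bool:
        return record.levelno < logging.ERROR


def make_record(level: int) -> logging.LogRecord:
    # Build a bare record by hand; only the level matters for this check.
    return logging.LogRecord("demo", level, "demo.py", 0, "msg", None, None)


flt = BelowErrorFilter()
levels = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]
passed = {name: flt.filter(make_record(getattr(logging, name))) for name in levels}
print(passed)
```

Records for which the filter returns a false value are dropped by the handler, which is how the stdout handler avoids repeating ERROR and CRITICAL records.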

View File

@@ -0,0 +1,105 @@
import datetime as dt
import json
import logging
LOG_RECORD_BUILTIN_ATTRS = {
"args",
"asctime",
"created",
"exc_info",
"exc_text",
"filename",
"funcName",
"levelname",
"levelno",
"lineno",
"module",
"msecs",
"message",
"msg",
"name",
"pathname",
"process",
"processName",
"relativeCreated",
"stack_info",
"thread",
"threadName",
"taskName",
}
"""
This formatter is not currently used because the app runs on Google Cloud Run, but it can be reused in future apps.
To test it, add the following to the logging config:
formatters:
"json": {
"()": "json_formatter.JSONFormatter",
"fmt_keys": {
"level": "levelname",
"message": "message",
"timestamp": "timestamp",
"logger": "name",
"module": "module",
"function": "funcName",
"line": "lineno",
"thread_name": "threadName"
}
}
handlers:
"file_json": {
"class": "logging.handlers.RotatingFileHandler",
"level": "DEBUG",
"formatter": "json",
"filename": "logs/log",
"maxBytes": 1000000,
"backupCount": 3
}
and add "cfg://handlers.file_json" to the queue handler's "handlers" list.
"""
# From this video https://www.youtube.com/watch?v=9L77QExPmI0
# Src here: https://github.com/mCodingLLC/VideosSampleCode/blob/master/videos/135_modern_logging/mylogger.py
class JSONFormatter(logging.Formatter):
    def __init__(
        self,
        *,
        fmt_keys: dict[str, str] | None = None,
    ):
        super().__init__()
        self.fmt_keys = fmt_keys if fmt_keys is not None else {}

    def format(self, record: logging.LogRecord) -> str:
        message = self._prepare_log_dict(record)
        return json.dumps(message, default=str)

    def _prepare_log_dict(self, record: logging.LogRecord):
        always_fields = {
            "message": record.getMessage(),
            "timestamp": dt.datetime.fromtimestamp(
                record.created, tz=dt.timezone.utc
            ).isoformat(),
        }
        if record.exc_info is not None:
            always_fields["exc_info"] = self.formatException(record.exc_info)

        if record.stack_info is not None:
            always_fields["stack_info"] = self.formatStack(record.stack_info)

        message = {
            key: msg_val
            if (msg_val := always_fields.pop(val, None)) is not None
            else getattr(record, val)
            for key, val in self.fmt_keys.items()
        }
        message.update(always_fields)

        for key, val in record.__dict__.items():
            if key not in LOG_RECORD_BUILTIN_ATTRS:
                message[key] = val

        return message
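The `fmt_keys` mapping goes from output key to `LogRecord` attribute, which is easy to read backwards. The sketch below illustrates that direction with a deliberately simplified formatter; `MiniJSONFormatter` is a hypothetical name and omits the exception, stack and extra-attribute handling of the class above:

```python
import datetime as dt
import json
import logging


class MiniJSONFormatter(logging.Formatter):
    """Simplified, hypothetical variant: only fmt_keys plus message/timestamp."""

    def __init__(self, *, fmt_keys=None):
        super().__init__()
        self.fmt_keys = fmt_keys or {}

    def format(self, record: logging.LogRecord) -> str:
        computed = {
            "message": record.getMessage(),
            "timestamp": dt.datetime.fromtimestamp(
                record.created, tz=dt.timezone.utc
            ).isoformat(),
        }
        out = {}
        for key, attr in self.fmt_keys.items():
            # Prefer the computed field; otherwise read the attribute off the record.
            out[key] = computed.pop(attr) if attr in computed else getattr(record, attr)
        out.update(computed)  # computed fields that were not mapped are still emitted
        return json.dumps(out, default=str)


record = logging.LogRecord(
    "app.demo", logging.WARNING, "demo.py", 12, "disk at %d%%", (91,), None
)
formatter = MiniJSONFormatter(
    fmt_keys={"level": "levelname", "logger": "name", "line": "lineno", "message": "message"}
)
payload = json.loads(formatter.format(record))
print(payload)
```

Because `timestamp` is not listed in `fmt_keys` here, it is appended under its own name by the final `update` call.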

View File

@@ -0,0 +1,53 @@
{
"version": 1,
"objects": {
"queue": {
"class": "queue.Queue",
"maxsize": 1000
}
},
"disable_existing_loggers": false,
"formatters": {
"simple": {
"format": "[%(levelname)s] (%(module)s|L: %(lineno)d) %(asctime)s: %(message)s",
"datefmt": "%Y-%m-%dT%H:%M:%S%z"
}
},
"filters": {
"error_and_above": {
"()": "app.configs.logging.ErrorAndAboveFilter"
}
},
"handlers": {
"console": {
"class": "logging.StreamHandler",
"level": "INFO",
"formatter": "simple",
"stream": "ext://sys.stdout",
"filters": ["error_and_above"]
},
"error": {
"class": "logging.StreamHandler",
"level": "ERROR",
"formatter": "simple",
"stream": "ext://sys.stderr"
},
"queue_handler": {
"class": "app.configs.logging.QueueListenerHandler",
"handlers": [
"cfg://handlers.console",
"cfg://handlers.error"
],
"queue": "cfg://objects.queue",
"respect_handler_level": true
}
},
"loggers": {
"root": {
"level": "DEBUG",
"handlers": [
"queue_handler"
]
}
}
}
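The code that loads this JSON file is not part of this section. As a rough sketch of how such a config is typically applied, the example below feeds a trimmed-down config to `logging.config.dictConfig`. The logger name `sketch` and the inline `CONFIG_JSON` string are assumptions for illustration, and the custom filter and queue handler are left out so the example only depends on the standard library:

```python
import json
import logging
import logging.config

# Trimmed, hypothetical config: same shape as the file above, stdlib classes only.
CONFIG_JSON = """
{
  "version": 1,
  "disable_existing_loggers": false,
  "formatters": {
    "simple": {"format": "[%(levelname)s] %(name)s: %(message)s"}
  },
  "handlers": {
    "console": {
      "class": "logging.StreamHandler",
      "level": "INFO",
      "formatter": "simple",
      "stream": "ext://sys.stdout"
    }
  },
  "loggers": {
    "sketch": {"level": "DEBUG", "handlers": ["console"], "propagate": false}
  }
}
"""


def setup_logging(raw: str = CONFIG_JSON) -> logging.Logger:
    # dictConfig expects a plain dict, so the JSON text is parsed first.
    logging.config.dictConfig(json.loads(raw))
    return logging.getLogger("sketch")


logger = setup_logging()
logger.debug("dropped by the handler, whose level is INFO")
logger.info("logging configured")
```

The logger level (DEBUG) and the handler level (INFO) are checked independently, which is why the first call produces no output.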

View File

@@ -0,0 +1,61 @@
from logging.config import ConvertingList, ConvertingDict, valid_ident
from logging.handlers import QueueHandler, QueueListener
from queue import Queue
import atexit


class QueueHandlerHelper:

    @staticmethod
    def resolve_handlers(l):
        if not isinstance(l, ConvertingList):
            return l

        # Indexing the list performs the evaluation.
        return [l[i] for i in range(len(l))]

    @staticmethod
    def resolve_queue(q):
        if not isinstance(q, ConvertingDict):
            return q
        if '__resolved_value__' in q:
            return q['__resolved_value__']

        cname = q.pop('class')
        klass = q.configurator.resolve(cname)
        props = q.pop('.', None)
        kwargs = {k: q[k] for k in q if valid_ident(k)}
        result = klass(**kwargs)
        if props:
            for name, value in props.items():
                setattr(result, name, value)

        q['__resolved_value__'] = result
        return result


# The video at https://www.youtube.com/watch?v=9L77QExPmI0 relies on logging features that are only available in Python 3.12.
# This article provides the class needed to build the queue handler on 3.11:
# https://rob-blackbourn.medium.com/how-to-use-python-logging-queuehandler-with-dictconfig-1e8b1284e27a
class QueueListenerHandler(QueueHandler):

    def __init__(self, handlers, respect_handler_level=False, auto_run=True, queue=None):
        # Create the fallback queue per instance; a Queue(-1) default argument
        # would be built once and shared by every handler that omits `queue`.
        if queue is None:
            queue = Queue(-1)
        queue = QueueHandlerHelper.resolve_queue(queue)
        super().__init__(queue)
        handlers = QueueHandlerHelper.resolve_handlers(handlers)
        self._listener = QueueListener(
            self.queue,
            *handlers,
            respect_handler_level=respect_handler_level)
        if auto_run:
            self.start()
            atexit.register(self.stop)

    def start(self):
        self._listener.start()

    def stop(self):
        self._listener.stop()

    def emit(self, record):
        return super().emit(record)
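The handler above combines two standard-library pieces: a `QueueHandler` that enqueues records on the calling thread and a `QueueListener` that drains the queue on a background thread. A minimal sketch of that pairing without `dictConfig`; `ListHandler` and the logger name `queue_sketch` are hypothetical and only exist to make the result observable:

```python
import logging
import queue
from logging.handlers import QueueHandler, QueueListener


class ListHandler(logging.Handler):
    """Hypothetical sink that collects messages in memory."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


log_queue = queue.Queue(-1)  # unbounded queue
sink = ListHandler()
listener = QueueListener(log_queue, sink, respect_handler_level=True)

logger = logging.getLogger("queue_sketch")
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(QueueHandler(log_queue))

listener.start()
logger.info("first")
logger.error("second")
listener.stop()  # waits for the listener thread to drain what was queued

print(sink.messages)
```

Calling `stop()` before reading the results matters: the listener thread may not have processed the records yet when `logger.error` returns.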

View File

@@ -1,6 +1,7 @@
import uuid
-from helper.constants import *
+from .constants import MinTimers
def getListeningPartTemplate():
return {
@@ -11,15 +12,17 @@ def getListeningPartTemplate():
"exercises": []
}
def getListeningTemplate():
return {
"parts": [],
"isDiagnostic": False,
-"minTimer": LISTENING_MIN_TIMER_DEFAULT,
+"minTimer": MinTimers.LISTENING_MIN_TIMER_DEFAULT,
"module": "listening"
}
def getListeningPostSample():
return {
"parts": [
@@ -1161,7 +1164,7 @@ def getSpeakingTemplate():
}
],
"isDiagnostic": False,
-"minTimer": SPEAKING_MIN_TIMER_DEFAULT,
+"minTimer": MinTimers.SPEAKING_MIN_TIMER_DEFAULT,
"module": "speaking"
}
@@ -1223,7 +1226,7 @@ def getWritingTemplate():
}
],
"isDiagnostic": False,
-"minTimer": WRITING_MIN_TIMER_DEFAULT,
+"minTimer": MinTimers.WRITING_MIN_TIMER_DEFAULT,
"module": "writing",
"type": "general"
}
@@ -1236,3 +1239,37 @@ def getWritingPostSample():
"To what extent do you agree or disagree with the statement that technology has had a positive impact on modern society? In your response, critically examine the opposing perspectives on this issue, considering both the benefits and drawbacks of technological advancements. Support your arguments with relevant examples and evidence, and conclude with your own stance on the matter."
]
}
def get_question_tips(question: str, answer: str, correct_answer: str, context: str = None):
messages = [
{
"role": "user",
"content": "You are an IELTS exam program that analyzes incorrect answers to questions and gives tips to "
"help students understand why it was a wrong answer and gives helpful insight for the future. "
"The tip should refer to the context and question.",
}
]
if not (context is None or context == ""):
messages.append({
"role": "user",
"content": f"This is the context for the question: {context}",
})
messages.extend([
{
"role": "user",
"content": f"This is the question: {question}",
},
{
"role": "user",
"content": f"This is the answer: {answer}",
},
{
"role": "user",
"content": f"This is the correct answer: {correct_answer}",
}
])
return messages

View File

View File

@@ -0,0 +1,17 @@
from .level import ILevelController
from .listening import IListeningController
from .reading import IReadingController
from .writing import IWritingController
from .speaking import ISpeakingController
from .grade import IGradeController
from .training import ITrainingController
__all__ = [
"IListeningController",
"IReadingController",
"IWritingController",
"ISpeakingController",
"ILevelController",
"IGradeController",
"ITrainingController"
]

View File

@@ -0,0 +1,26 @@
from abc import ABC, abstractmethod
from typing import Dict
class IGradeController(ABC):
@abstractmethod
async def grade_writing_task(self, task: int, data):
pass
@abstractmethod
async def grade_speaking_task(self, task: int, data: Dict):
pass
@abstractmethod
async def grading_summary(self, data: Dict):
pass
@abstractmethod
async def _grade_speaking_task_1_2(self, task: int, question: str, answer_firebase_path: str):
pass
@abstractmethod
async def _grade_speaking_task3(self, answers: Dict):
pass

View File

@@ -0,0 +1,12 @@
from abc import ABC, abstractmethod
class ILevelController(ABC):
@abstractmethod
async def get_level_exam(self):
pass
@abstractmethod
async def get_level_utas(self):
pass

View File

@@ -0,0 +1,13 @@
from abc import ABC, abstractmethod
from typing import List
class IListeningController(ABC):
@abstractmethod
async def get_listening_question(self, section_id: int, topic: str, exercises: List[str], difficulty: str):
pass
@abstractmethod
async def save_listening(self, data):
pass

View File

@@ -0,0 +1,10 @@
from abc import ABC, abstractmethod
from typing import List
class IReadingController(ABC):
@abstractmethod
async def get_reading_passage(self, passage: int, topic: str, exercises: List[str], difficulty: str):
pass

View File

@@ -0,0 +1,21 @@
from abc import ABC, abstractmethod
from fastapi import BackgroundTasks
class ISpeakingController(ABC):
@abstractmethod
async def get_speaking_task(self, task: int, topic: str, difficulty: str):
pass
@abstractmethod
async def save_speaking(self, data, background_tasks: BackgroundTasks):
pass
@abstractmethod
async def generate_speaking_video(self, data):
pass
@abstractmethod
async def generate_interactive_video(self, data):
pass

View File

@@ -0,0 +1,8 @@
from abc import ABC, abstractmethod
class ITrainingController(ABC):
@abstractmethod
async def fetch_tips(self, data):
pass

View File

@@ -0,0 +1,8 @@
from abc import ABC, abstractmethod
class IWritingController(ABC):
@abstractmethod
async def get_writing_task_general_question(self, task: int, topic: str, difficulty: str):
pass

View File

@@ -0,0 +1,17 @@
from .level import LevelController
from .listening import ListeningController
from .reading import ReadingController
from .speaking import SpeakingController
from .writing import WritingController
from .training import TrainingController
from .grade import GradeController
__all__ = [
"LevelController",
"ListeningController",
"ReadingController",
"SpeakingController",
"WritingController",
"TrainingController",
"GradeController"
]

View File

@@ -0,0 +1,86 @@
import logging
import os
import uuid
from typing import Dict

from fastapi import HTTPException
from pydantic import ValidationError

from app.configs.constants import FilePaths
from app.controllers.abc import IGradeController
from app.dtos.speaking import SpeakingGradeTask1And2DTO, SpeakingGradeTask3DTO
from app.dtos.writing import WritingGradeTaskDTO
from app.helpers import IOHelper
from app.services.abc import ISpeakingService, IWritingService, IGradeService


class GradeController(IGradeController):

    def __init__(
        self,
        grade_service: IGradeService,
        speaking_service: ISpeakingService,
        writing_service: IWritingService
    ):
        self._service = grade_service
        self._speaking_service = speaking_service
        self._writing_service = writing_service
        self._logger = logging.getLogger(__name__)

    async def grade_writing_task(self, task: int, data: WritingGradeTaskDTO):
        try:
            return await self._writing_service.grade_writing_task(task, data.question, data.answer)
        except Exception as e:
            return str(e)

    async def grade_speaking_task(self, task: int, data: Dict):
        try:
            if task in {1, 2}:
                body = SpeakingGradeTask1And2DTO(**data)
                return await self._grade_speaking_task_1_2(task, body.question, body.answer)
            else:
                body = SpeakingGradeTask3DTO(**data)
                return await self._grade_speaking_task3(body.answers)
        except ValidationError as e:
            raise HTTPException(status_code=422, detail=e.errors())

    async def grading_summary(self, data: Dict):
        try:
            section_keys = ['reading', 'listening', 'writing', 'speaking', 'level']
            extracted_sections = self._extract_existing_sections_from_body(data, section_keys)
            return await self._service.calculate_grading_summary(extracted_sections)
        except Exception as e:
            return str(e)

    async def _grade_speaking_task_1_2(self, task: int, question: str, answer_firebase_path: str):
        sound_file_name = FilePaths.AUDIO_FILES_PATH + str(uuid.uuid4())
        try:
            IOHelper.delete_files_older_than_one_day(FilePaths.AUDIO_FILES_PATH)
            return await self._speaking_service.grade_speaking_task_1_and_2(
                task, question, answer_firebase_path, sound_file_name
            )
        except Exception as e:
            # The failure may happen before the audio file is written, so only
            # remove it if it exists; otherwise the original error would be masked.
            if os.path.exists(sound_file_name):
                os.remove(sound_file_name)
            return str(e), 400

    async def _grade_speaking_task3(self, answers: Dict):
        try:
            IOHelper.delete_files_older_than_one_day(FilePaths.AUDIO_FILES_PATH)
            return await self._speaking_service.grade_speaking_task_3(answers)
        except Exception as e:
            return str(e), 400

    @staticmethod
    def _extract_existing_sections_from_body(my_dict, keys_to_extract):
        # Implicitly returns None when the body has no non-empty 'sections' list.
        if 'sections' in my_dict and isinstance(my_dict['sections'], list) and len(my_dict['sections']) > 0:
            return list(
                filter(
                    lambda item:
                    'code' in item and
                    item['code'] in keys_to_extract and
                    'grade' in item and
                    'name' in item,
                    my_dict['sections']
                )
            )
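The section filter above keeps only entries that carry a known `code` plus both `grade` and `name`. The sketch below shows the same rule as a standalone function, with one deliberate difference: this variant returns an empty list when `sections` is missing, where the method above returns `None`. The name `extract_sections` and the sample body are assumptions for illustration:

```python
from typing import Any, Dict, Iterable, List

SECTION_KEYS = ["reading", "listening", "writing", "speaking", "level"]


def extract_sections(body: Dict[str, Any], allowed_codes: Iterable[str]) -> List[Dict[str, Any]]:
    """Hypothetical variant: same filter rule, but always returns a list."""
    allowed = set(allowed_codes)
    sections = body.get("sections")
    if not isinstance(sections, list):
        return []
    return [
        item
        for item in sections
        if isinstance(item, dict)
        and item.get("code") in allowed
        and "grade" in item
        and "name" in item
    ]


body = {
    "sections": [
        {"code": "reading", "name": "Reading", "grade": 7.0},
        {"code": "grammar", "name": "Grammar", "grade": 6.0},  # unknown code
        {"code": "writing", "name": "Writing"},                # no grade
    ]
}
kept = extract_sections(body, SECTION_KEYS)
empty = extract_sections({}, SECTION_KEYS)
print(kept, empty)
```

Whether an empty list or `None` is the right result for a body without sections depends on how `calculate_grading_summary` treats its input, which is not shown here.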

View File

@@ -0,0 +1,20 @@
from app.controllers.abc import ILevelController
from app.services.abc import ILevelService
class LevelController(ILevelController):
def __init__(self, level_service: ILevelService):
self._service = level_service
async def get_level_exam(self):
try:
return await self._service.get_level_exam()
except Exception as e:
return str(e)
async def get_level_utas(self):
try:
return await self._service.get_level_utas()
except Exception as e:
return str(e)

View File

@@ -0,0 +1,97 @@
import random
import logging
from typing import List
from app.controllers.abc import IListeningController
from app.dtos import SaveListeningDTO
from app.services.abc import IListeningService
from app.helpers import IOHelper, ExercisesHelper
from app.configs.constants import (
FilePaths, EducationalContent, FieldsAndExercises
)
class ListeningController(IListeningController):
def __init__(self, listening_service: IListeningService):
self._service = listening_service
self._logger = logging.getLogger(__name__)
self._sections = {
"section_1": {
"topic": EducationalContent.TWO_PEOPLE_SCENARIOS,
"exercise_sample_size": 1,
"total_exercises": FieldsAndExercises.TOTAL_LISTENING_SECTION_1_EXERCISES,
"type": "conversation",
"start_id": 1
},
"section_2": {
"topic": EducationalContent.SOCIAL_MONOLOGUE_CONTEXTS,
"exercise_sample_size": 2,
"total_exercises": FieldsAndExercises.TOTAL_LISTENING_SECTION_2_EXERCISES,
"type": "monologue",
"start_id": 11
},
"section_3": {
"topic": EducationalContent.FOUR_PEOPLE_SCENARIOS,
"exercise_sample_size": 1,
"total_exercises": FieldsAndExercises.TOTAL_LISTENING_SECTION_3_EXERCISES,
"type": "conversation",
"start_id": 21
},
"section_4": {
"topic": EducationalContent.ACADEMIC_SUBJECTS,
"exercise_sample_size": 2,
"total_exercises": FieldsAndExercises.TOTAL_LISTENING_SECTION_4_EXERCISES,
"type": "monologue",
"start_id": 31
}
}
async def get_listening_question(self, section_id: int, topic: str, req_exercises: List[str], difficulty: str):
try:
IOHelper.delete_files_older_than_one_day(FilePaths.AUDIO_FILES_PATH)
section = self._sections[f"section_{str(section_id)}"]
if not topic:
topic = random.choice(section["topic"])
if len(req_exercises) == 0:
req_exercises = random.sample(FieldsAndExercises.LISTENING_EXERCISE_TYPES, section["exercise_sample_size"])
number_of_exercises_q = ExercisesHelper.divide_number_into_parts(section["total_exercises"], len(req_exercises))
dialog = await self._service.generate_listening_question(section_id, topic)
if section_id in {1, 3}:
dialog = self.parse_conversation(dialog)
self._logger.info(f'Generated {section["type"]}: {str(dialog)}')
exercises = await self._service.generate_listening_exercises(
section_id, str(dialog), req_exercises, number_of_exercises_q, section["start_id"], difficulty
)
return {
"exercises": exercises,
"text": dialog,
"difficulty": difficulty
}
except Exception as e:
return str(e)
async def save_listening(self, data: SaveListeningDTO):
try:
return await self._service.save_listening(data.parts, data.minTimer, data.difficulty)
except Exception as e:
return str(e)
@staticmethod
def parse_conversation(conversation_data):
conversation_list = conversation_data.get('conversation', [])
readable_text = []
for message in conversation_list:
name = message.get('name', 'Unknown')
text = message.get('text', '')
readable_text.append(f"{name}: {text}")
return "\n".join(readable_text)
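`parse_conversation` flattens the generated dialog into one `name: text` line per message. A self-contained sketch of the same logic follows; the sample payload is an assumption about the dialog shape, inferred only from the `.get` calls above:

```python
from typing import Any, Dict


def parse_conversation(conversation_data: Dict[str, Any]) -> str:
    # Mirrors the static method above: missing keys fall back to defaults.
    lines = []
    for message in conversation_data.get("conversation", []):
        name = message.get("name", "Unknown")
        text = message.get("text", "")
        lines.append(f"{name}: {text}")
    return "\n".join(lines)


# Hypothetical payload; the second message has no "name" key on purpose.
sample = {
    "conversation": [
        {"name": "Anna", "text": "Hi, I'd like to book a taxi."},
        {"text": "Of course, where to?"},
    ]
}
transcript = parse_conversation(sample)
print(transcript)
```

A payload without a `conversation` key yields an empty string, so a missing dialog reaches the exercise generator as empty text without raising an error.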

View File

@@ -0,0 +1,43 @@
import random
import logging
from typing import List
from app.controllers.abc import IReadingController
from app.services.abc import IReadingService
from app.configs.constants import FieldsAndExercises
from app.helpers import ExercisesHelper
class ReadingController(IReadingController):
def __init__(self, reading_service: IReadingService):
self._service = reading_service
self._logger = logging.getLogger(__name__)
self._passages = {
"passage_1": {
"total_exercises": FieldsAndExercises.TOTAL_READING_PASSAGE_1_EXERCISES
},
"passage_2": {
"total_exercises": FieldsAndExercises.TOTAL_READING_PASSAGE_2_EXERCISES
},
"passage_3": {
"total_exercises": FieldsAndExercises.TOTAL_READING_PASSAGE_3_EXERCISES
}
}
async def get_reading_passage(self, passage_id: int, topic: str, req_exercises: List[str], difficulty: str):
try:
passage = self._passages[f'passage_{str(passage_id)}']
if len(req_exercises) == 0:
req_exercises = random.sample(FieldsAndExercises.READING_EXERCISE_TYPES, 2)
number_of_exercises_q = ExercisesHelper.divide_number_into_parts(
passage["total_exercises"], len(req_exercises)
)
return await self._service.gen_reading_passage(
passage_id, topic, req_exercises, number_of_exercises_q, difficulty
)
except Exception as e:
return str(e)

View File

@@ -0,0 +1,63 @@
import logging
import uuid
from fastapi import BackgroundTasks
from app.controllers.abc import ISpeakingController
from app.dtos import (
SaveSpeakingDTO, SpeakingGenerateVideoDTO,
SpeakingGenerateInteractiveVideoDTO
)
from app.services.abc import ISpeakingService
from app.configs.constants import ExamVariant, MinTimers
from app.configs.question_templates import getSpeakingTemplate
class SpeakingController(ISpeakingController):
def __init__(self, speaking_service: ISpeakingService):
self._service = speaking_service
self._logger = logging.getLogger(__name__)
async def get_speaking_task(self, task: int, topic: str, difficulty: str):
try:
return await self._service.get_speaking_task(task, topic, difficulty)
except Exception as e:
return str(e)
async def save_speaking(self, data: SaveSpeakingDTO, background_tasks: BackgroundTasks):
try:
exercises = data.exercises
min_timer = data.minTimer
template = getSpeakingTemplate()
template["minTimer"] = min_timer
if min_timer < MinTimers.SPEAKING_MIN_TIMER_DEFAULT:
template["variant"] = ExamVariant.PARTIAL.value
else:
template["variant"] = ExamVariant.FULL.value
req_id = str(uuid.uuid4())
self._logger.info(f'Received request to save speaking with id: {req_id}')
background_tasks.add_task(self._service.create_videos_and_save_to_db, exercises, template, req_id)
self._logger.info('Started background task to save speaking.')
# Return response without waiting for create_videos_and_save_to_db to finish
return {**template, "id": req_id}
except Exception as e:
return str(e)
async def generate_speaking_video(self, data: SpeakingGenerateVideoDTO):
try:
return await self._service.generate_speaking_video(data.question, data.topic, data.avatar, data.prompts)
except Exception as e:
return str(e)
async def generate_interactive_video(self, data: SpeakingGenerateInteractiveVideoDTO):
try:
return await self._service.generate_interactive_video(
questions=data.questions, topic=data.topic, avatar=data.avatar
)
except Exception as e:
return str(e)

View File

@@ -0,0 +1,15 @@
from app.controllers.abc import ITrainingController
from app.dtos import TipsDTO
from app.services.abc import ITrainingService
class TrainingController(ITrainingController):
def __init__(self, training_service: ITrainingService):
self._service = training_service
async def fetch_tips(self, data: TipsDTO):
try:
return await self._service.fetch_tips(data.context, data.question, data.answer, data.correct_answer)
except Exception as e:
return str(e)

View File

@@ -0,0 +1,14 @@
from app.controllers.abc import IWritingController
from app.services.abc import IWritingService
class WritingController(IWritingController):
def __init__(self, writing_service: IWritingService):
self._service = writing_service
async def get_writing_task_general_question(self, task: int, topic: str, difficulty: str):
try:
return await self._service.get_writing_task_general_question(task, topic, difficulty)
except Exception as e:
return str(e)

19
app/dtos/__init__.py Normal file
View File

@@ -0,0 +1,19 @@
from .listening import SaveListeningDTO
from .speaking import (
SaveSpeakingDTO, SpeakingGradeTask1And2DTO,
SpeakingGradeTask3DTO, SpeakingGenerateVideoDTO,
SpeakingGenerateInteractiveVideoDTO
)
from .training import TipsDTO
from .writing import WritingGradeTaskDTO
__all__ = [
"SaveListeningDTO",
"SaveSpeakingDTO",
"SpeakingGradeTask1And2DTO",
"SpeakingGradeTask3DTO",
"SpeakingGenerateVideoDTO",
"SpeakingGenerateInteractiveVideoDTO",
"TipsDTO",
"WritingGradeTaskDTO"
]

12
app/dtos/listening.py Normal file
View File

@@ -0,0 +1,12 @@
import random
from typing import List, Dict
from pydantic import BaseModel, Field
from app.configs.constants import MinTimers, EducationalContent
class SaveListeningDTO(BaseModel):
parts: List[Dict]
minTimer: int = MinTimers.LISTENING_MIN_TIMER_DEFAULT
difficulty: str = Field(default_factory=lambda: random.choice(EducationalContent.DIFFICULTIES))  # chosen per request, not once at import
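Reviewer sketch: the difference between a default computed when the class body runs and one computed per instance can be shown with the standard library's `dataclasses`. Pydantic's `Field(default_factory=...)` is the analogous mechanism for the DTOs here. `DIFFICULTIES` below is an assumed stand-in for `EducationalContent.DIFFICULTIES`.

```python
import random
from dataclasses import dataclass, field

DIFFICULTIES = ["easy", "medium", "hard"]  # assumed stand-in values


@dataclass
class EagerDefault:
    # random.choice runs once, when the class body is executed.
    difficulty: str = random.choice(DIFFICULTIES)


@dataclass
class LazyDefault:
    # The factory runs again for every instance created without a value.
    difficulty: str = field(default_factory=lambda: random.choice(DIFFICULTIES))


eager = {EagerDefault().difficulty for _ in range(200)}
lazy = {LazyDefault().difficulty for _ in range(200)}
```

With the eager form, every request that omits `difficulty` gets the same value until the process restarts.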

34
app/dtos/speaking.py Normal file
View File

@@ -0,0 +1,34 @@
import random
from typing import List, Dict
from pydantic import BaseModel, Field
from app.configs.constants import MinTimers, AvatarEnum
class SaveSpeakingDTO(BaseModel):
exercises: List[Dict]
minTimer: int = MinTimers.SPEAKING_MIN_TIMER_DEFAULT
class SpeakingGradeTask1And2DTO(BaseModel):
question: str
answer: str
class SpeakingGradeTask3DTO(BaseModel):
answers: Dict
class SpeakingGenerateVideoDTO(BaseModel):
avatar: str = Field(default_factory=lambda: random.choice(list(AvatarEnum)).value)
prompts: List[str] = []
question: str
topic: str
class SpeakingGenerateInteractiveVideoDTO(BaseModel):
avatar: str = Field(default_factory=lambda: random.choice(list(AvatarEnum)).value)
questions: List[str]
topic: str

8
app/dtos/training.py Normal file
View File

@@ -0,0 +1,8 @@
from pydantic import BaseModel
class TipsDTO(BaseModel):
context: str
question: str
answer: str
correct_answer: str

6
app/dtos/writing.py Normal file
View File

@@ -0,0 +1,6 @@
from pydantic import BaseModel
class WritingGradeTaskDTO(BaseModel):
question: str
answer: str

View File

@@ -0,0 +1,6 @@
from .exceptions import CustomException, UnauthorizedException
__all__ = [
"CustomException",
"UnauthorizedException"
]

View File

@@ -0,0 +1,17 @@
from http import HTTPStatus
class CustomException(Exception):
code = HTTPStatus.INTERNAL_SERVER_ERROR
error_code = HTTPStatus.INTERNAL_SERVER_ERROR
message = HTTPStatus.INTERNAL_SERVER_ERROR.description
def __init__(self, message=None):
if message:
self.message = message
class UnauthorizedException(CustomException):
code = HTTPStatus.UNAUTHORIZED
error_code = HTTPStatus.UNAUTHORIZED
message = HTTPStatus.UNAUTHORIZED.description
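Reviewer sketch: the exception classes carry their HTTP status as class attributes, and only `message` can be overridden per instance. The snippet below restates the two classes and adds a hypothetical `to_payload` helper (not part of the commit) that mirrors what an exception handler might serialize.

```python
from http import HTTPStatus


class CustomException(Exception):
    code = HTTPStatus.INTERNAL_SERVER_ERROR
    error_code = HTTPStatus.INTERNAL_SERVER_ERROR
    message = HTTPStatus.INTERNAL_SERVER_ERROR.description

    def __init__(self, message=None):
        if message:
            self.message = message


class UnauthorizedException(CustomException):
    code = HTTPStatus.UNAUTHORIZED
    error_code = HTTPStatus.UNAUTHORIZED
    message = HTTPStatus.UNAUTHORIZED.description


def to_payload(exc: CustomException) -> dict:
    # Hypothetical helper: flatten an exception into a JSON-friendly dict.
    return {"status": int(exc.code), "error_code": int(exc.error_code), "message": exc.message}


default_payload = to_payload(UnauthorizedException())
custom_payload = to_payload(UnauthorizedException("Token expired"))
```

Raising the class itself (as in `raise cls.exception`) instantiates it with no arguments, so the class-level `message` is what the client sees in that case.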

11
app/helpers/__init__.py Normal file
View File

@@ -0,0 +1,11 @@
from .io import IOHelper
from .text_helper import TextHelper
from .token_counter import count_tokens
from .exercises_helper import ExercisesHelper
__all__ = [
"IOHelper",
"TextHelper",
"count_tokens",
"ExercisesHelper"
]

View File

@@ -0,0 +1,195 @@
import queue
import random
import re
import string
from wonderwords import RandomWord
from .text_helper import TextHelper
class ExercisesHelper:
@staticmethod
def divide_number_into_parts(number, parts):
if number < parts:
return None
part_size = number // parts
remaining = number % parts
q = queue.Queue()
for i in range(parts):
if i < remaining:
q.put(part_size + 1)
else:
q.put(part_size)
return q
@staticmethod
def fix_exercise_ids(exercise, start_id):
# Initialize the starting ID for the first exercise
current_id = start_id
questions = exercise["questions"]
# Iterate through questions and update the "id" value
for question in questions:
question["id"] = str(current_id)
current_id += 1
return exercise
@staticmethod
def replace_first_occurrences_with_placeholders(text: str, words_to_replace: list, start_id):
for i, word in enumerate(words_to_replace, start=start_id):
# Create a case-insensitive regular expression pattern
pattern = re.compile(r'\b' + re.escape(word) + r'\b', re.IGNORECASE)
placeholder = '{{' + str(i) + '}}'
text = pattern.sub(placeholder, text, 1)
return text
@staticmethod
def replace_first_occurrences_with_placeholders_notes(notes: list, words_to_replace: list, start_id):
replaced_notes = []
for i, note in enumerate(notes, start=0):
word = words_to_replace[i]
pattern = re.compile(r'\b' + re.escape(word) + r'\b', re.IGNORECASE)
placeholder = '{{' + str(start_id + i) + '}}'
note = pattern.sub(placeholder, note, 1)
replaced_notes.append(note)
return replaced_notes
@staticmethod
def add_random_words_and_shuffle(word_array, num_random_words):
r = RandomWord()
random_words_selected = r.random_words(num_random_words)
combined_array = word_array + random_words_selected
random.shuffle(combined_array)
return combined_array
@staticmethod
def fillblanks_build_solutions_array(words, start_id):
solutions = []
for i, word in enumerate(words, start=start_id):
solutions.append(
{
"id": str(i),
"solution": word
}
)
return solutions
@staticmethod
def remove_excess_questions(questions: list, quantity):
count_true = 0
result = []
for item in reversed(questions):
if item.get('solution') == 'true' and count_true < quantity:
count_true += 1
else:
result.append(item)
result.reverse()
return result
@staticmethod
def build_write_blanks_text(questions: list, start_id):
result = ""
for i, q in enumerate(questions, start=start_id):
placeholder = '{{' + str(i) + '}}'
result = result + q["question"] + placeholder + "\\n"
return result
@staticmethod
def build_write_blanks_text_form(form: list, start_id):
result = ""
replaced_words = []
for i, entry in enumerate(form, start=start_id):
placeholder = '{{' + str(i) + '}}'
# Use regular expression to find the string after ':'
match = re.search(r'(?<=:)\s*(.*)', entry)
# Extract the matched string
original_string = match.group(1)
# Split the string into words
words = re.findall(r'\b\w+\b', original_string)
# Remove words with only one letter
filtered_words = [word for word in words if len(word) > 1]
# Choose a random word from the list of words
selected_word = random.choice(filtered_words)
pattern = re.compile(r'\b' + re.escape(selected_word) + r'\b', re.IGNORECASE)
# Replace the chosen word with the placeholder
replaced_string = pattern.sub(placeholder, original_string, 1)
# Construct the final replaced string
replaced_string = entry.replace(original_string, replaced_string)
result = result + replaced_string + "\\n"
# Save the replaced word or use it as needed
# For example, you can save it to a file or a list
replaced_words.append(selected_word)
return result, replaced_words
@staticmethod
def build_write_blanks_solutions(questions: list, start_id):
solutions = []
for i, q in enumerate(questions, start=start_id):
solution = [q["possible_answers"]] if isinstance(q["possible_answers"], str) else q["possible_answers"]
solutions.append(
{
"id": str(i),
"solution": solution
}
)
return solutions
@staticmethod
def build_write_blanks_solutions_listening(words: list, start_id):
solutions = []
for i, word in enumerate(words, start=start_id):
solution = [word] if isinstance(word, str) else word
solutions.append(
{
"id": str(i),
"solution": solution
}
)
return solutions
@staticmethod
def answer_word_limit_ok(question):
# Check if any option in any solution has more than three words
return not any(
len(option.split()) > 3
for solution in question["solutions"]
for option in solution["solution"]
)
@staticmethod
def assign_letters_to_paragraphs(paragraphs):
result = []
letters = iter(string.ascii_uppercase)
for paragraph in paragraphs.split("\n\n"):
if TextHelper.has_x_words(paragraph, 10):
result.append({'paragraph': paragraph.strip(), 'letter': next(letters)})
return result
@staticmethod
def contains_empty_dict(arr):
return any(elem == {} for elem in arr)
@staticmethod
def fix_writing_overall(overall: float, task_response: dict):
if overall > max(task_response.values()) or overall < min(task_response.values()):
total_sum = sum(task_response.values())
average = total_sum / len(task_response.values())
rounded_average = round(average, 0)
return rounded_average
return overall
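Reviewer sketch: `divide_number_into_parts` is what the controllers use to split a section's total question count across the requested exercise types. The snippet below restates it and drains the queue so the distribution is visible. `drain` is a hypothetical helper added only for inspection.

```python
import queue


def divide_number_into_parts(number: int, parts: int):
    if number < parts:
        return None
    part_size, remaining = divmod(number, parts)
    q = queue.Queue()
    for i in range(parts):
        # The first `remaining` parts each take one extra item.
        q.put(part_size + 1 if i < remaining else part_size)
    return q


def drain(q: queue.Queue) -> list:
    # Hypothetical helper: empty the queue into a list.
    items = []
    while not q.empty():
        items.append(q.get())
    return items


sizes = drain(divide_number_into_parts(10, 3))
too_few = divide_number_into_parts(2, 3)
```

The `None` return for `number < parts` is worth keeping in mind: callers that hand the result straight to a service expecting a `Queue` would need to guard against it.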

20
app/helpers/io.py Normal file
View File

@@ -0,0 +1,20 @@
import datetime
import os
from pathlib import Path
class IOHelper:
@staticmethod
def delete_files_older_than_one_day(directory: str):
current_time = datetime.datetime.now()
for entry in os.scandir(directory):
if entry.is_file():
file_path = Path(entry)
file_name = file_path.name
file_modified_time = datetime.datetime.fromtimestamp(file_path.stat().st_mtime)
time_difference = current_time - file_modified_time
if time_difference > datetime.timedelta(days=1) and "placeholder" not in file_name:
file_path.unlink()
print(f"Deleted file: {file_path}")
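Reviewer sketch: how `timedelta.days` behaves around the one-day boundary, compared with comparing two `timedelta` values directly. The ages below are made-up examples.

```python
from datetime import timedelta

ages = [timedelta(hours=23), timedelta(hours=36), timedelta(hours=49)]

# `.days` only counts whole days, so a 36-hour-old file has days == 1.
by_days_attr = [age.days > 1 for age in ages]
# Comparing timedeltas directly keeps the sub-day precision.
by_timedelta = [age > timedelta(days=1) for age in ages]
```

A check written as `.days > 1` therefore only matches files that are at least two full days old.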

View File

@@ -0,0 +1,28 @@
from nltk.corpus import words
class TextHelper:
@classmethod
def has_words(cls, text: str):
if not cls._has_common_words(text):
return False
english_words = set(words.words())
words_in_input = text.split()
return any(word.lower() in english_words for word in words_in_input)
@classmethod
def has_x_words(cls, text: str, quantity):
if not cls._has_common_words(text):
return False
english_words = set(words.words())
words_in_input = text.split()
english_word_count = sum(1 for word in words_in_input if word.lower() in english_words)
return english_word_count >= quantity
@staticmethod
def _has_common_words(text: str):
english_words = {"the", "be", "to", "of", "and", "a", "in", "that", "have", "i"}
words_in_input = text.split()
english_word_count = sum(1 for word in words_in_input if word.lower() in english_words)
return english_word_count >= 10

View File

@@ -1,5 +1,8 @@
# This is a work in progress. There are still bugs. Once it is production-ready this will become a full repo.
import tiktoken
import nltk
def count_tokens(text, model_name="gpt-3.5-turbo", debug=False):
"""
@@ -38,7 +41,6 @@ def count_tokens(text, model_name="gpt-3.5-turbo", debug=False):
# Try using tiktoken
try:
encoding = tiktoken.encoding_for_model(model_name)
num_tokens = len(encoding.encode(text))
result = {"n_tokens": num_tokens, "method": "tiktoken"}
@@ -50,8 +52,7 @@ def count_tokens(text, model_name="gpt-3.5-turbo", debug=False):
# Try using nltk
try:
# Passed nltk.download("punkt") to server.py's @asynccontextmanager
tokens = nltk.word_tokenize(text)
result = {"n_tokens": len(tokens), "method": "nltk"}
return result

View File

@@ -0,0 +1,9 @@
from .authentication import AuthBackend, AuthenticationMiddleware
from .authorization import Authorized, IsAuthenticatedViaBearerToken
__all__ = [
"AuthBackend",
"AuthenticationMiddleware",
"Authorized",
"IsAuthenticatedViaBearerToken"
]

View File

@@ -0,0 +1,48 @@
import os
from typing import Tuple
import jwt
from jwt import InvalidTokenError
from pydantic import BaseModel, Field
from starlette.authentication import AuthenticationBackend
from starlette.middleware.authentication import (
AuthenticationMiddleware as BaseAuthenticationMiddleware,
)
from starlette.requests import HTTPConnection
class Session(BaseModel):
authenticated: bool = Field(False, description="Is user authenticated?")
class AuthBackend(AuthenticationBackend):
async def authenticate(
self, conn: HTTPConnection
) -> Tuple[bool, Session]:
session = Session()
authorization: str = conn.headers.get("Authorization")
if not authorization:
return False, session
try:
scheme, token = authorization.split(" ")
if scheme.lower() != "bearer":
return False, session
except ValueError:
return False, session
jwt_secret_key = os.getenv("JWT_SECRET_KEY")
if not jwt_secret_key:
return False, session
try:
jwt.decode(token, jwt_secret_key, algorithms=["HS256"])
except InvalidTokenError:
return False, session
session.authenticated = True
return True, session
class AuthenticationMiddleware(BaseAuthenticationMiddleware):
pass
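Reviewer sketch: the header handling in `AuthBackend.authenticate` can be isolated into a small pure function, which makes the edge cases easy to check without a JWT library. `extract_bearer_token` is a hypothetical name, not something the commit defines.

```python
from typing import Optional


def extract_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Return the token from a 'Bearer <token>' header value, else None."""
    if not authorization:
        return None
    try:
        scheme, token = authorization.split(" ")
    except ValueError:
        # Raised when the value does not split into exactly two parts.
        return None
    if scheme.lower() != "bearer":
        return None
    return token
```

Because the split is on a single space, a header with two spaces between scheme and token is rejected rather than tolerated; whether that strictness is desired is a design choice.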

View File

@@ -0,0 +1,36 @@
from abc import ABC, abstractmethod
from typing import List, Type
from fastapi import Request
from fastapi.openapi.models import APIKey, APIKeyIn
from fastapi.security.base import SecurityBase
from app.exceptions import CustomException, UnauthorizedException
class BaseAuthorization(ABC):
exception = CustomException
@abstractmethod
async def has_permission(self, request: Request) -> bool:
pass
class IsAuthenticatedViaBearerToken(BaseAuthorization):
exception = UnauthorizedException
async def has_permission(self, request: Request) -> bool:
return request.user.authenticated
class Authorized(SecurityBase):
def __init__(self, permissions: List[Type[BaseAuthorization]]):
self.permissions = permissions
self.model: APIKey = APIKey(**{"in": APIKeyIn.header}, name="Authorization")
self.scheme_name = self.__class__.__name__
async def __call__(self, request: Request):
for permission in self.permissions:
cls = permission()
if not await cls.has_permission(request=request):
raise cls.exception
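Reviewer sketch: the permission pattern used by `Authorized` (instantiate each permission class, raise its `exception` on the first failure) can be exercised without FastAPI. Everything below is a stand-in: `PermissionDenied`, `check`, and `is_allowed` are hypothetical names, and the request is faked with `SimpleNamespace`.

```python
import asyncio
from abc import ABC, abstractmethod
from types import SimpleNamespace


class PermissionDenied(Exception):
    """Hypothetical stand-in for UnauthorizedException."""


class BasePermission(ABC):
    exception = Exception

    @abstractmethod
    async def has_permission(self, request) -> bool:
        ...


class IsAuthenticated(BasePermission):
    exception = PermissionDenied

    async def has_permission(self, request) -> bool:
        return request.user.authenticated


async def check(permissions, request) -> None:
    # Instantiate each permission class; raise its exception on the first failure.
    for permission in permissions:
        instance = permission()
        if not await instance.has_permission(request):
            raise instance.exception


def is_allowed(authenticated: bool) -> bool:
    request = SimpleNamespace(user=SimpleNamespace(authenticated=authenticated))
    try:
        asyncio.run(check([IsAuthenticated], request))
        return True
    except PermissionDenied:
        return False
```

The check relies on `request.user` being the `Session` object returned by the authentication backend, so the middleware has to run before any route that depends on it.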

View File

View File

@@ -0,0 +1,7 @@
from .file_storage import IFileStorage
from .document_store import IDocumentStore
__all__ = [
"IFileStorage",
"IDocumentStore"
]

View File

@@ -0,0 +1,13 @@
from abc import ABC
class IDocumentStore(ABC):
async def save_to_db(self, collection: str, item):
pass
async def save_to_db_with_id(self, collection: str, item, id: str):
pass
async def get_all(self, collection: str):
pass
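Reviewer sketch: inheriting from `ABC` does not by itself force subclasses to implement anything; only methods marked `@abstractmethod` do. The two toy classes below (hypothetical names) show the difference, which matters here because `IDocumentStore` omits the decorator while `IFileStorage` uses it.

```python
from abc import ABC, abstractmethod


class LooseStore(ABC):
    # No @abstractmethod: the class can be instantiated as-is.
    async def get_all(self, collection: str):
        pass


class StrictStore(ABC):
    @abstractmethod
    async def get_all(self, collection: str):
        pass


loose_ok = isinstance(LooseStore(), LooseStore)

try:
    StrictStore()
    strict_ok = True
except TypeError:
    # Instantiating a class with unimplemented abstract methods fails.
    strict_ok = False
```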

View File

@@ -0,0 +1,16 @@
from abc import ABC, abstractmethod
class IFileStorage(ABC):
@abstractmethod
async def download_firebase_file(self, source_blob_name, destination_file_name):
pass
@abstractmethod
async def upload_file_firebase_get_url(self, destination_blob_name, source_file_name):
pass
@abstractmethod
async def make_public(self, blob_name: str):
pass

View File

@@ -0,0 +1,8 @@
from .document_stores import *
from .firebase import FirebaseStorage
__all__ = [
"FirebaseStorage"
]
__all__.extend(document_stores.__all__)

View File

@@ -0,0 +1,7 @@
from .firestore import Firestore
#from .mongo import MongoDB
__all__ = [
"Firestore",
#"MongoDB"
]

View File

@@ -0,0 +1,38 @@
import logging
from google.cloud.firestore_v1.async_client import AsyncClient
from google.cloud.firestore_v1.async_collection import AsyncCollectionReference
from google.cloud.firestore_v1.async_document import AsyncDocumentReference
from app.repositories.abc import IDocumentStore
class Firestore(IDocumentStore):
def __init__(self, client: AsyncClient):
self._client = client
self._logger = logging.getLogger(__name__)
async def save_to_db(self, collection: str, item):
collection_ref: AsyncCollectionReference = self._client.collection(collection)
update_time, document_ref = await collection_ref.add(item)
if document_ref:
self._logger.info(f"Document added with ID: {document_ref.id}")
return True, document_ref.id
else:
return False, None
async def save_to_db_with_id(self, collection: str, item, id: str):
collection_ref: AsyncCollectionReference = self._client.collection(collection)
document_ref: AsyncDocumentReference = collection_ref.document(id)
await document_ref.set(item)
doc_snapshot = await document_ref.get()
if doc_snapshot.exists:
self._logger.info(f"Document added with ID: {document_ref.id}")
return True, document_ref.id
else:
return False, None
async def get_all(self, collection: str):
collection_ref: AsyncCollectionReference = self._client.collection(collection)
docs = []
async for doc in collection_ref.stream():
docs.append(doc.to_dict())
return docs

View File

@@ -0,0 +1,36 @@
"""import logging
from pymongo import MongoClient
from app.repositories.abc import IDocumentStore
class MongoDB(IDocumentStore):
def __init__(self, client: MongoClient):
self._client = client
self._logger = logging.getLogger(__name__)
def save_to_db(self, collection: str, item):
collection_ref = self._client[collection]
result = collection_ref.insert_one(item)
if result.inserted_id:
self._logger.info(f"Document added with ID: {result.inserted_id}")
return True, str(result.inserted_id)
else:
return False, None
def save_to_db_with_id(self, collection: str, item, doc_id: str):
collection_ref = self._client[collection]
item['_id'] = doc_id
result = collection_ref.replace_one({'_id': doc_id}, item, upsert=True)
if result.upserted_id or result.matched_count:
self._logger.info(f"Document added with ID: {doc_id}")
return True, doc_id
else:
return False, None
def get_all(self, collection: str):
collection_ref = self._client[collection]
all_documents = list(collection_ref.find())
return all_documents
"""

View File

@@ -0,0 +1,83 @@
import logging
from typing import Optional
import aiofiles
from httpx import AsyncClient
from app.repositories.abc import IFileStorage
class FirebaseStorage(IFileStorage):
def __init__(self, client: AsyncClient, token: str, bucket: str):
self._httpx_client = client
self._token = token
self._storage_url = f'https://firebasestorage.googleapis.com/v0/b/{bucket}'
self._logger = logging.getLogger(__name__)
async def download_firebase_file(self, source_blob_name: str, destination_file_name: str) -> Optional[str]:
source_blob_name = source_blob_name.replace('/', '%2F')
download_url = f"{self._storage_url}/o/{source_blob_name}?alt=media"
response = await self._httpx_client.get(
download_url,
headers={'Authorization': f'Firebase {self._token}'}
)
if response.status_code == 200:
async with aiofiles.open(destination_file_name, 'wb') as file:
await file.write(response.content)
self._logger.info(f"File downloaded to {destination_file_name}")
return destination_file_name
else:
self._logger.error(f"Failed to download blob {source_blob_name}. {response.status_code} - {response.content}")
return None
async def upload_file_firebase_get_url(self, destination_blob_name: str, source_file_name: str) -> Optional[str]:
destination_blob_name = destination_blob_name.replace('/', '%2F')
upload_url = f"{self._storage_url}/o/{destination_blob_name}"
async with aiofiles.open(source_file_name, 'rb') as file:
file_bytes = await file.read()
response = await self._httpx_client.post(
upload_url,
headers={
'Authorization': f'Firebase {self._token}',
"X-Goog-Upload-Protocol": "multipart"
},
files={
'metadata': (None, '{"metadata":{"test":"testMetadata"}}', 'application/json'),
'file': file_bytes
}
)
if response.status_code == 200:
self._logger.info(f"File {source_file_name} uploaded to {self._storage_url}/o/{destination_blob_name}.")
# TODO: Test this
#await self.make_public(destination_blob_name)
file_url = f"{self._storage_url}/o/{destination_blob_name}"
return file_url
else:
self._logger.error(f"Failed to upload file {source_file_name}. Error: {response.status_code} - {str(response.content)}")
return None
async def make_public(self, destination_blob_name: str):
acl_url = f"{self._storage_url}/o/{destination_blob_name}/acl"
acl = {'entity': 'allUsers', 'role': 'READER'}
response = await self._httpx_client.post(
acl_url,
headers={
'Authorization': f'Bearer {self._token}',
'Content-Type': 'application/json'
},
json=acl
)
if response.status_code == 200:
self._logger.info(f"Blob {destination_blob_name} is now public.")
else:
self._logger.error(f"Failed to make blob {destination_blob_name} public. {response.status_code} - {response.content}")
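Reviewer sketch: object names are embedded in the URL path here, so every reserved character needs percent-encoding, not only `/`. The snippet compares the slash-only replacement with `urllib.parse.quote` using `safe=""`. The blob name is a made-up example.

```python
from urllib.parse import quote

blob_name = "listening/section 1/audio#1.mp3"  # hypothetical object name

# Only slashes are escaped; the space and '#' are left as-is.
slash_only = blob_name.replace("/", "%2F")
# safe="" makes quote() escape '/' as well as other reserved characters.
fully_quoted = quote(blob_name, safe="")
```

An unescaped `#` would be treated as the start of a URL fragment, so the request would target a different object name than intended.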

149
app/server.py Normal file
View File

@@ -0,0 +1,149 @@
import json
import os
import pathlib
import logging.config
import logging.handlers
import aioboto3
import contextlib
from contextlib import asynccontextmanager
from collections import defaultdict
from typing import List
from http import HTTPStatus
import httpx
import whisper
from fastapi import FastAPI, Request
from fastapi.encoders import jsonable_encoder
from fastapi.exceptions import RequestValidationError
from fastapi.middleware import Middleware
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
import nltk
from dotenv import load_dotenv
from starlette import status
from app.api import router
from app.configs import config_di
from app.exceptions import CustomException
from app.middlewares import AuthenticationMiddleware, AuthBackend
load_dotenv()
@asynccontextmanager
async def lifespan(_app: FastAPI):
"""
Startup and Shutdown logic is in this lifespan method
https://fastapi.tiangolo.com/advanced/events/
"""
# Whisper model
whisper_model = whisper.load_model("base")
# NLTK required datasets download
nltk.download('words')
nltk.download("punkt")
# AWS Polly client instantiation
context_stack = contextlib.AsyncExitStack()
session = aioboto3.Session()
polly_client = await context_stack.enter_async_context(
session.client(
'polly',
region_name='eu-west-1',
aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY")
)
)
# HTTP Client
http_client = httpx.AsyncClient()
config_di(
polly_client=polly_client,
http_client=http_client,
whisper_model=whisper_model
)
# Setup logging
config_file = pathlib.Path("./app/configs/logging/logging_config.json")
with open(config_file) as f_in:
config = json.load(f_in)
logging.config.dictConfig(config)
yield
await http_client.aclose()
await polly_client.close()
await context_stack.aclose()
def setup_listeners(_app: FastAPI) -> None:
@_app.exception_handler(RequestValidationError)
async def custom_form_validation_error(request, exc):
"""
Don't delete request param
"""
reformatted_message = defaultdict(list)
for pydantic_error in exc.errors():
loc, msg = pydantic_error["loc"], pydantic_error["msg"]
filtered_loc = loc[1:] if loc[0] in ("body", "query", "path") else loc
field_string = ".".join(str(part) for part in filtered_loc)
if field_string == "cookie.refresh_token":
return JSONResponse(
status_code=401,
content={"error_code": 401, "message": HTTPStatus.UNAUTHORIZED.description},
)
reformatted_message[field_string].append(msg)
return JSONResponse(
status_code=status.HTTP_400_BAD_REQUEST,
content=jsonable_encoder(
{"details": "Invalid request!", "errors": reformatted_message}
),
)
@_app.exception_handler(CustomException)
async def custom_exception_handler(request: Request, exc: CustomException):
"""
Don't delete request param
"""
return JSONResponse(
status_code=exc.code,
content={"error_code": exc.error_code, "message": exc.message},
)
def setup_middleware() -> List[Middleware]:
middleware = [
Middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
),
Middleware(
AuthenticationMiddleware,
backend=AuthBackend()
)
]
return middleware
def create_app() -> FastAPI:
_app = FastAPI(
docs_url=None,
redoc_url=None,
middleware=setup_middleware(),
lifespan=lifespan
)
_app.include_router(router)
setup_listeners(_app)
return _app
app = create_app()
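Reviewer sketch: the startup/shutdown flow in `lifespan` leans on `contextlib.AsyncExitStack`. The standalone snippet below shows the ordering guarantees using only the standard library. `resource` is a hypothetical stand-in for the Polly and HTTP clients, and this `lifespan` takes no app argument, unlike the FastAPI one.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

events = []


@asynccontextmanager
async def resource(name: str):
    # Hypothetical stand-in for a client that must be closed on shutdown.
    events.append(f"open {name}")
    try:
        yield name
    finally:
        events.append(f"close {name}")


@asynccontextmanager
async def lifespan():
    async with AsyncExitStack() as stack:
        await stack.enter_async_context(resource("polly"))
        await stack.enter_async_context(resource("http"))
        events.append("startup done")
        yield
        events.append("shutdown begins")
    # Leaving the stack closes the resources in reverse order of entry.


async def main():
    async with lifespan():
        events.append("serving")


asyncio.run(main())
```

Anything entered through the stack is closed when the stack exits, so a client registered this way normally does not need a separate explicit close call.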

0
app/services/__init__.py Normal file
View File

View File

@@ -0,0 +1,19 @@
from .level import ILevelService
from .listening import IListeningService
from .writing import IWritingService
from .speaking import ISpeakingService
from .reading import IReadingService
from .grade import IGradeService
from .training import ITrainingService
from .third_parties import *
__all__ = [
"ILevelService",
"IListeningService",
"IWritingService",
"ISpeakingService",
"IReadingService",
"IGradeService",
"ITrainingService"
]
__all__.extend(third_parties.__all__)

23
app/services/abc/grade.py Normal file
View File

@@ -0,0 +1,23 @@
from abc import ABC, abstractmethod
from typing import Dict, List
class IGradeService(ABC):
@abstractmethod
async def calculate_grading_summary(self, extracted_sections: List):
pass
@abstractmethod
async def _calculate_section_grade_summary(self, section):
pass
@staticmethod
@abstractmethod
def _parse_openai_response(response):
pass
@staticmethod
@abstractmethod
def _parse_bullet_points(bullet_points_str, grade):
pass

24
app/services/abc/level.py Normal file
View File

@@ -0,0 +1,24 @@
from abc import ABC, abstractmethod
class ILevelService(ABC):
@abstractmethod
async def get_level_exam(self):
pass
@abstractmethod
async def get_level_utas(self):
pass
@abstractmethod
async def _gen_multiple_choice_level(self, quantity: int, start_id=1):
pass
@abstractmethod
async def _replace_exercise_if_exists(self, all_exams, current_exercise, current_exam, seen_keys):
pass
@abstractmethod
async def _generate_single_mc_level_question(self):
pass

View File

@@ -0,0 +1,68 @@
from abc import ABC, abstractmethod
from queue import Queue
from typing import Dict
class IListeningService(ABC):
@abstractmethod
async def generate_listening_question(self, section: int, topic: str) -> Dict:
pass
@abstractmethod
async def generate_listening_exercises(
self, section: int, dialog: str,
req_exercises: list[str], exercises_queue: Queue,
start_id: int, difficulty: str
):
pass
@abstractmethod
async def save_listening(self, parts, min_timer, difficulty):
pass
# ==================================================================================================================
# Helpers
# ==================================================================================================================
@abstractmethod
async def _generate_listening_conversation(self, section: int, topic: str) -> Dict:
pass
@abstractmethod
async def _generate_listening_monologue(self, section: int, topic: str) -> Dict:
pass
@abstractmethod
def _get_conversation_voices(self, response: Dict, unique_voices_across_segments: bool):
pass
@staticmethod
@abstractmethod
def _get_random_voice(gender: str):
pass
@abstractmethod
async def _gen_multiple_choice_exercise_listening(
self, dialog_type: str, text: str, quantity: int, start_id, difficulty
):
pass
@abstractmethod
async def _gen_write_blanks_questions_exercise_listening(
self, dialog_type: str, text: str, quantity: int, start_id, difficulty
):
pass
@abstractmethod
async def _gen_write_blanks_notes_exercise_listening(
self, dialog_type: str, text: str, quantity: int, start_id, difficulty
):
pass
@abstractmethod
async def _gen_write_blanks_form_exercise_listening(
self, dialog_type: str, text: str, quantity: int, start_id, difficulty
):
pass

View File

@@ -0,0 +1,49 @@
from abc import ABC, abstractmethod
from queue import Queue
from typing import List
from app.configs.constants import QuestionType
class IReadingService(ABC):
@abstractmethod
async def gen_reading_passage(
self,
passage_id: int,
topic: str,
req_exercises: List[str],
number_of_exercises_q: Queue,
difficulty: str
):
pass
# ==================================================================================================================
# Helpers
# ==================================================================================================================
@abstractmethod
async def generate_reading_passage(self, q_type: QuestionType, topic: str):
pass
@abstractmethod
async def _generate_reading_exercises(
self, passage: str, req_exercises: list, number_of_exercises_q, start_id, difficulty
):
pass
@abstractmethod
async def _gen_summary_fill_blanks_exercise(self, text: str, quantity: int, start_id, difficulty):
pass
@abstractmethod
async def _gen_true_false_not_given_exercise(self, text: str, quantity: int, start_id, difficulty):
pass
@abstractmethod
async def _gen_write_blanks_exercise(self, text: str, quantity: int, start_id, difficulty):
pass
@abstractmethod
async def _gen_paragraph_match_exercise(self, text: str, quantity: int, start_id):
pass

View File

@@ -0,0 +1,57 @@
from abc import ABC, abstractmethod
from typing import List, Dict
class ISpeakingService(ABC):
@abstractmethod
async def get_speaking_task(self, task_id: int, topic: str, difficulty: str):
pass
@abstractmethod
async def grade_speaking_task_1_and_2(
self, task: int, question: str, answer_firebase_path: str, sound_file_name: str
):
pass
@abstractmethod
async def grade_speaking_task_3(self, answers: Dict, task: int = 3):
pass
@abstractmethod
async def create_videos_and_save_to_db(self, exercises: List[Dict], template: Dict, req_id: str):
pass
@abstractmethod
async def generate_speaking_video(self, original_question: str, topic: str, avatar: str, prompts: List[str]):
pass
@abstractmethod
async def generate_interactive_video(self, questions: List[str], avatar: str, topic: str):
pass
# ==================================================================================================================
# Helpers
# ==================================================================================================================
@staticmethod
@abstractmethod
def _zero_rating(comment: str):
pass
@staticmethod
@abstractmethod
def _calculate_overall(response: Dict):
pass
@abstractmethod
async def _get_speaking_corrections(self, text):
pass
@abstractmethod
async def _create_video_per_part(self, exercises: List[Dict], template: Dict, part: int):
pass
@abstractmethod
async def _create_video(self, question: str, avatar: str, error_message: str):
pass


@@ -0,0 +1,13 @@
from .stt import ISpeechToTextService
from .tts import ITextToSpeechService
from .llm import ILLMService
from .vid_gen import IVideoGeneratorService
from .ai_detector import IAIDetectorService
__all__ = [
"ISpeechToTextService",
"ITextToSpeechService",
"ILLMService",
"IVideoGeneratorService",
"IAIDetectorService"
]


@@ -0,0 +1,13 @@
from abc import ABC, abstractmethod
from typing import Dict, Optional
class IAIDetectorService(ABC):
@abstractmethod
async def run_detection(self, text: str):
pass
@abstractmethod
def _parse_detection(self, response: Dict) -> Optional[Dict]:
pass


@@ -0,0 +1,21 @@
from abc import ABC, abstractmethod
from typing import List, Optional
class ILLMService(ABC):
@abstractmethod
async def prediction(
self,
model: str,
messages: List,
fields_to_check: Optional[List[str]],
temperature: float,
check_blacklisted: bool = True,
token_count: int = -1
):
pass
@abstractmethod
async def prediction_override(self, **kwargs):
pass


@@ -0,0 +1,8 @@
from abc import ABC, abstractmethod
class ISpeechToTextService(ABC):
@abstractmethod
async def speech_to_text(self, file_path):
pass


@@ -0,0 +1,22 @@
from abc import ABC, abstractmethod
from typing import Union
class ITextToSpeechService(ABC):
@abstractmethod
async def synthesize_speech(self, text: str, voice: str, engine: str, output_format: str):
pass
@abstractmethod
async def text_to_speech(self, text: Union[list[str], str], file_name: str):
pass
@abstractmethod
async def _conversation_to_speech(self, conversation: list):
pass
@abstractmethod
async def _text_to_speech(self, text: str):
pass


@@ -0,0 +1,10 @@
from abc import ABC, abstractmethod
from app.configs.constants import AvatarEnum
class IVideoGeneratorService(ABC):
@abstractmethod
async def create_video(self, text: str, avatar: str):
pass


@@ -0,0 +1,13 @@
from abc import ABC, abstractmethod
class ITrainingService(ABC):
@abstractmethod
async def fetch_tips(self, context: str, question: str, answer: str, correct_answer: str):
pass
@staticmethod
@abstractmethod
def _get_question_tips(question: str, answer: str, correct_answer: str, context: str = None):
pass


@@ -0,0 +1,32 @@
from abc import ABC, abstractmethod
from typing import Dict
class IWritingService(ABC):
@abstractmethod
async def get_writing_task_general_question(self, task: int, topic: str, difficulty: str):
pass
@abstractmethod
async def grade_writing_task(self, task: int, question: str, answer: str):
pass
# ==================================================================================================================
# Helpers
# ==================================================================================================================
@staticmethod
@abstractmethod
def _get_writing_prompt(task: int, topic: str, difficulty: str):
pass
@abstractmethod
async def _get_fixed_text(self, text):
pass
@staticmethod
@abstractmethod
def _zero_rating(comment: str):
pass


@@ -0,0 +1,19 @@
from .level import LevelService
from .listening import ListeningService
from .reading import ReadingService
from .speaking import SpeakingService
from .writing import WritingService
from .grade import GradeService
from .training import TrainingService
from .third_parties import *
__all__ = [
"LevelService",
"ListeningService",
"ReadingService",
"SpeakingService",
"WritingService",
"GradeService",
"TrainingService"
]
__all__.extend(third_parties.__all__)

156
app/services/impl/grade.py Normal file

@@ -0,0 +1,156 @@
import json
from typing import List
import copy
from app.services.abc import ILLMService, IGradeService
class GradeService(IGradeService):
chat_config = {'max_tokens': 1000, 'temperature': 0.2}
tools = [{
"type": "function",
"function": {
"name": "save_evaluation_and_suggestions",
"description": "Saves the evaluation and suggestions requested by input.",
"parameters": {
"type": "object",
"properties": {
"evaluation": {
"type": "string",
"description": "A comment on the IELTS section grade obtained in the specific section and what it could mean without suggestions.",
},
"suggestions": {
"type": "string",
"description": "A small paragraph text with suggestions on how to possibly get a better grade than the one obtained.",
},
"bullet_points": {
"type": "string",
"description": "Text with four bullet points to improve the english speaking ability. Only include text for the bullet points separated by a paragraph. ",
},
},
"required": ["evaluation", "suggestions"],
},
}
}]
def __init__(self, llm: ILLMService):
self._llm = llm
async def calculate_grading_summary(self, extracted_sections: List):
ret = []
for section in extracted_sections:
openai_response_dict = await self._calculate_section_grade_summary(section)
ret.append(
{
'code': section['code'],
'name': section['name'],
'grade': section['grade'],
'evaluation': openai_response_dict['evaluation'],
'suggestions': openai_response_dict['suggestions'],
'bullet_points': self._parse_bullet_points(openai_response_dict.get('bullet_points'), section['grade'])
}
)
return {'sections': ret}
async def _calculate_section_grade_summary(self, section):
section_name = section['name']
section_grade = section['grade']
messages = [
{
"role": "user",
"content": (
'You are an IELTS test section grade evaluator. You will receive an IELTS test section name and the '
'grade obtained in the section. You should offer an evaluation comment on this grade and separately '
'suggestions on how to possibly get a better grade.'
)
},
{
"role": "user",
"content": f'Section: {str(section_name)} Grade: {str(section_grade)}',
},
{
"role": "user",
"content": "Speak in third person."
},
{
"role": "user",
"content": "Don't offer suggestions in the evaluation comment. Only in the suggestions section."
},
{
"role": "user",
"content": (
"Your evaluation comment on the grade should enunciate the grade, be insightful, be speculative, "
"be one paragraph long."
)
},
{
"role": "user",
"content": "Please save the evaluation comment and suggestions generated."
},
{
"role": "user",
"content": f"Offer bullet points to improve the english {str(section_name)} ability."
},
]
if section['code'] == "level":
messages[2:2] = [{
"role": "user",
"content": (
"This section is comprised of multiple choice questions that measure the user's overall english "
"level. These multiple choice questions are about knowledge on vocabulary, syntax, grammar rules, "
"and contextual usage. The grade obtained measures the ability in these areas and english language "
"overall."
)
}]
elif section['code'] == "speaking":
messages[2:2] = [{
"role": "user",
"content": (
"This section is designed to assess the English language proficiency of individuals who want to "
"study or work in English-speaking countries. The speaking section evaluates a candidate's ability "
"to communicate effectively in spoken English."
)
}]
chat_config = copy.deepcopy(self.chat_config)
tools = copy.deepcopy(self.tools)
res = await self._llm.prediction_override(
model="gpt-3.5-turbo",
max_tokens=chat_config['max_tokens'],
temperature=chat_config['temperature'],
tools=tools,
messages=messages
)
return self._parse_openai_response(res)
@staticmethod
def _parse_openai_response(response):
if 'choices' in response and len(response['choices']) > 0 and 'message' in response['choices'][
0] and 'tool_calls' in response['choices'][0]['message'] and isinstance(
response['choices'][0]['message']['tool_calls'], list) and len(
response['choices'][0]['message']['tool_calls']) > 0 and \
response['choices'][0]['message']['tool_calls'][0]['function']['arguments']:
return json.loads(response['choices'][0]['message']['tool_calls'][0]['function']['arguments'])
else:
return {'evaluation': "", 'suggestions': "", 'bullet_points': []}
@staticmethod
def _parse_bullet_points(bullet_points_str, grade):
max_grade_for_suggestions = 9
if isinstance(bullet_points_str, str) and grade < max_grade_for_suggestions:
# Split the string by '\n'
lines = bullet_points_str.split('\n')
# Remove only the leading '-' bullet marker and trim whitespace, so hyphenated words stay intact
cleaned_lines = [line.strip().lstrip('-').strip() for line in lines]
# Add '.' to lines that don't end with it
return [line + '.' if line and not line.endswith('.') else line for line in cleaned_lines]
else:
return []
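
The bullet-point cleanup can also be looked at in isolation. What follows is a minimal sketch rather than the service's own code: `clean_bullet_points` is a hypothetical standalone name, and it assumes bullets are marked with a leading `-`, `*` or `•`. It strips only that leading marker, so hyphenated words survive, and it drops blank lines.

```python
from typing import List


def clean_bullet_points(bullet_points_str: str) -> List[str]:
    """Hypothetical helper: turn a newline-separated bullet list into clean sentences."""
    cleaned = []
    for line in bullet_points_str.split('\n'):
        # Trim whitespace, then remove only a leading bullet marker (assumed: '-', '*' or '•').
        text = line.strip().lstrip('-*•').strip()
        if not text:
            continue  # skip blank lines between bullets
        # Terminate each bullet with a period if it lacks one.
        cleaned.append(text if text.endswith('.') else text + '.')
    return cleaned
```

For example, `clean_bullet_points("- Practise well-known topics\n\n- Record yourself.")` yields two entries, with the hyphen in "well-known" preserved.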

506
app/services/impl/level.py Normal file

@@ -0,0 +1,506 @@
import json
import random
import uuid
from app.configs.constants import GPTModels, TemperatureSettings, EducationalContent, QuestionType
from app.helpers import ExercisesHelper
from app.repositories.abc import IDocumentStore
from app.services.abc import ILevelService, ILLMService, IReadingService
class LevelService(ILevelService):
def __init__(
self, llm: ILLMService, document_store: IDocumentStore, reading_service: IReadingService
):
self._llm = llm
self._document_store = document_store
self._reading_service = reading_service
async def get_level_exam(self):
number_of_exercises = 25
exercises = await self._gen_multiple_choice_level(number_of_exercises)
return {
"exercises": [exercises],
"isDiagnostic": False,
"minTimer": 25,
"module": "level"
}
async def _gen_multiple_choice_level(self, quantity: int, start_id=1):
gen_multiple_choice_for_text = (
f'Generate {str(quantity)} multiple choice questions of 4 options for an english level exam, some easy '
'questions, some intermediate questions and some advanced questions. Ensure that the questions cover '
'a range of topics such as verb tense, subject-verb agreement, pronoun usage, sentence structure, and '
'punctuation. Make sure every question only has 1 correct answer.'
)
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"questions": [{"id": "9", "options": '
'[{"id": "A", "text": "And"}, {"id": "B", "text": "Cat"}, '
'{"id": "C", "text": "Happy"}, {"id": "D", "text": "Jump"}], '
'"prompt": "Which of the following is a conjunction?", '
'"solution": "A", "variant": "text"}]}'
)
},
{
"role": "user",
"content": gen_multiple_choice_for_text
}
]
question = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["questions"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
if len(question["questions"]) != quantity:
return await self._gen_multiple_choice_level(quantity, start_id)
else:
all_exams = await self._document_store.get_all("level")
seen_keys = set()
for i in range(len(question["questions"])):
question["questions"][i], seen_keys = await self._replace_exercise_if_exists(
all_exams, question["questions"][i], question, seen_keys
)
return {
"id": str(uuid.uuid4()),
"prompt": "Select the appropriate option.",
"questions": ExercisesHelper.fix_exercise_ids(question, start_id)["questions"],
"type": "multipleChoice",
}
async def _replace_exercise_if_exists(self, all_exams, current_exercise, current_exam, seen_keys):
# Extracting relevant fields for comparison
key = (current_exercise['prompt'], tuple(sorted(option['text'] for option in current_exercise['options'])))
# Check if the key is in the set
if key in seen_keys:
return await self._replace_exercise_if_exists(
all_exams, await self._generate_single_mc_level_question(), current_exam, seen_keys
)
else:
seen_keys.add(key)
for exam in all_exams:
exam_dict = exam.to_dict()
if any(
exercise["prompt"] == current_exercise["prompt"] and
any(exercise["options"][0]["text"] == current_option["text"] for current_option in
current_exercise["options"])
for exercise in exam_dict.get("exercises", [])[0]["questions"]
):
return await self._replace_exercise_if_exists(
all_exams, await self._generate_single_mc_level_question(), current_exam, seen_keys
)
return current_exercise, seen_keys
async def _generate_single_mc_level_question(self):
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"id": "9", "options": [{"id": "A", "text": "And"}, {"id": "B", "text": "Cat"}, '
'{"id": "C", "text": "Happy"}, {"id": "D", "text": "Jump"}], '
'"prompt": "Which of the following is a conjunction?", '
'"solution": "A", "variant": "text"}'
)
},
{
"role": "user",
"content": (
'Generate 1 multiple choice question of 4 options for an english level exam, it can be easy, '
'intermediate or advanced.'
)
}
]
question = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["options"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
return question
async def get_level_utas(self):
# Formats
mc = {
"id": str(uuid.uuid4()),
"prompt": "Choose the correct word or group of words that completes the sentences.",
"questions": None,
"type": "multipleChoice",
"part": 1
}
umc = {
"id": str(uuid.uuid4()),
"prompt": "Choose the underlined word or group of words that is not correct.",
"questions": None,
"type": "multipleChoice",
"part": 2
}
bs_1 = {
"id": str(uuid.uuid4()),
"prompt": "Read the text and write the correct word for each space.",
"questions": None,
"type": "blankSpaceText",
"part": 3
}
bs_2 = {
"id": str(uuid.uuid4()),
"prompt": "Read the text and write the correct word for each space.",
"questions": None,
"type": "blankSpaceText",
"part": 4
}
reading = {
"id": str(uuid.uuid4()),
"prompt": "Read the text and answer the questions below.",
"questions": None,
"type": "readingExercises",
"part": 5
}
all_mc_questions = []
# PART 1
mc_exercises1 = await self._gen_multiple_choice_blank_space_utas(15, 1, all_mc_questions)
print(json.dumps(mc_exercises1, indent=4))
all_mc_questions.append(mc_exercises1)
# PART 2
mc_exercises2 = await self._gen_multiple_choice_blank_space_utas(15, 16, all_mc_questions)
print(json.dumps(mc_exercises2, indent=4))
all_mc_questions.append(mc_exercises2)
# PART 3
mc_exercises3 = await self._gen_multiple_choice_blank_space_utas(15, 31, all_mc_questions)
print(json.dumps(mc_exercises3, indent=4))
all_mc_questions.append(mc_exercises3)
mc_exercises = mc_exercises1['questions'] + mc_exercises2['questions'] + mc_exercises3['questions']
print(json.dumps(mc_exercises, indent=4))
mc["questions"] = mc_exercises
# Underlined mc
underlined_mc = await self._gen_multiple_choice_underlined_utas(15, 46)
print(json.dumps(underlined_mc, indent=4))
umc["questions"] = underlined_mc
# Blank Space text 1
blank_space_text_1 = await self._gen_blank_space_text_utas(12, 61, 250)
print(json.dumps(blank_space_text_1, indent=4))
bs_1["questions"] = blank_space_text_1
# Blank Space text 2
blank_space_text_2 = await self._gen_blank_space_text_utas(14, 73, 350)
print(json.dumps(blank_space_text_2, indent=4))
bs_2["questions"] = blank_space_text_2
# Reading text
reading_text = await self._gen_reading_passage_utas(87, 10, 4)
print(json.dumps(reading_text, indent=4))
reading["questions"] = reading_text
return {
"exercises": {
"blankSpaceMultipleChoice": mc,
"underlinedMultipleChoice": umc,
"blankSpaceText1": bs_1,
"blankSpaceText2": bs_2,
"readingExercises": reading,
},
"isDiagnostic": False,
"minTimer": 25,
"module": "level"
}
async def _gen_multiple_choice_blank_space_utas(self, quantity: int, start_id: int, all_exams):
gen_multiple_choice_for_text = (
f'Generate {str(quantity)} multiple choice blank space questions of 4 options for an english '
'level exam, some easy questions, some intermediate questions and some advanced questions. Ensure '
'that the questions cover a range of topics such as verb tense, subject-verb agreement, pronoun usage, '
'sentence structure, and punctuation. Make sure every question only has 1 correct answer.'
)
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"questions": [{"id": "9", "options": [{"id": "A", "text": '
'"And"}, {"id": "B", "text": "Cat"}, {"id": "C", "text": '
'"Happy"}, {"id": "D", "text": "Jump"}], '
'"prompt": "Which of the following is a conjunction?", '
'"solution": "A", "variant": "text"}]}')
},
{
"role": "user",
"content": gen_multiple_choice_for_text
}
]
question = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["questions"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
if len(question["questions"]) != quantity:
return await self._gen_multiple_choice_blank_space_utas(quantity, start_id, all_exams)
else:
seen_keys = set()
for i in range(len(question["questions"])):
question["questions"][i], seen_keys = await self._replace_exercise_if_exists_utas(
all_exams,
question["questions"][i],
question,
seen_keys
)
return ExercisesHelper.fix_exercise_ids(question, start_id)
async def _replace_exercise_if_exists_utas(self, all_exams, current_exercise, current_exam, seen_keys):
# Extracting relevant fields for comparison
key = (current_exercise['prompt'], tuple(sorted(option['text'] for option in current_exercise['options'])))
# Check if the key is in the set
if key in seen_keys:
return await self._replace_exercise_if_exists_utas(
all_exams, await self._generate_single_mc_level_question(), current_exam, seen_keys
)
else:
seen_keys.add(key)
for exam in all_exams:
if any(
exercise["prompt"] == current_exercise["prompt"] and
any(exercise["options"][0]["text"] == current_option["text"] for current_option in
current_exercise["options"])
for exercise in exam.get("questions", [])
):
return await self._replace_exercise_if_exists_utas(
all_exams, await self._generate_single_mc_level_question(), current_exam, seen_keys
)
return current_exercise, seen_keys
async def _gen_multiple_choice_underlined_utas(self, quantity: int, start_id: int):
json_format = {
"questions": [
{
"id": "9",
"options": [
{
"id": "A",
"text": "a"
},
{
"id": "B",
"text": "b"
},
{
"id": "C",
"text": "c"
},
{
"id": "D",
"text": "d"
}
],
"prompt": "prompt",
"solution": "A",
"variant": "text"
}
]
}
gen_multiple_choice_for_text = (
f'Generate {str(quantity)} multiple choice questions of 4 options for an english '
'level exam, some easy questions, some intermediate questions and some advanced questions. Ensure that '
'the questions cover a range of topics such as verb tense, subject-verb agreement, pronoun usage, '
'sentence structure, and punctuation. Make sure every question only has 1 correct answer.'
)
messages = [
{
"role": "system",
"content": 'You are a helpful assistant designed to output JSON on this format: ' + str(json_format)
},
{
"role": "user",
"content": gen_multiple_choice_for_text
},
{
"role": "user",
"content": (
'The type of multiple choice is the prompt has wrong words or group of words and the options '
'are to find the wrong word or group of words that are underlined in the prompt. \nExample:\n'
'Prompt: "I <u>complain</u> about my boss <u>all the time</u>, but my colleagues <u>thinks</u> '
'the boss <u>is</u> nice."\nOptions:\na: "complain"\nb: "all the time"\nc: "thinks"\nd: "is"'
)
}
]
question = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["questions"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
if len(question["questions"]) != quantity:
return await self._gen_multiple_choice_underlined_utas(quantity, start_id)
else:
return ExercisesHelper.fix_exercise_ids(question, start_id)["questions"]
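async def _gen_blank_space_text_utas(self, quantity: int, start_id: int, size: int, topic=None):
# Pick the topic per call; a default in the signature would be evaluated only once, at definition time.
if topic is None:
topic = random.choice(EducationalContent.MTI_TOPICS)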
json_format = {
"question": {
"words": [
{
"id": "1",
"text": "a"
},
{
"id": "2",
"text": "b"
},
{
"id": "3",
"text": "c"
},
{
"id": "4",
"text": "d"
}
],
"text": "text"
}
}
messages = [
{
"role": "system",
"content": 'You are a helpful assistant designed to output JSON on this format: ' + str(json_format)
},
{
"role": "user",
"content": f'Generate a text of at least {str(size)} words about the topic {topic}.'
},
{
"role": "user",
"content": (
f'From the generated text choose {str(quantity)} words (cannot be sequential words) to replace '
'once with {{id}} where id starts on ' + str(start_id) + ' and is incremented for each word. '
'The ids must be ordered throughout the text and the words must be replaced only once. Put '
'the removed words and respective ids on the words array of the json in the correct order.'
)
}
]
question = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["question"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
return question["question"]
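async def _gen_reading_passage_utas(self, start_id, sa_quantity: int, mc_quantity: int, topic=None):
# Pick the topic per call; a default in the signature would be evaluated only once, at definition time.
if topic is None:
topic = random.choice(EducationalContent.MTI_TOPICS)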
passage = await self._reading_service.generate_reading_passage(QuestionType.READING_PASSAGE_1, topic)
short_answer = await self._gen_short_answer_utas(passage["text"], start_id, sa_quantity)
mc_exercises = await self._gen_text_multiple_choice_utas(passage["text"], start_id + sa_quantity, mc_quantity)
return {
"exercises": {
"shortAnswer": short_answer,
"multipleChoice": mc_exercises,
},
"text": {
"content": passage["text"],
"title": passage["title"]
}
}
async def _gen_short_answer_utas(self, text: str, start_id: int, sa_quantity: int):
json_format = {"questions": [{"id": 1, "question": "question", "possible_answers": ["answer_1", "answer_2"]}]}
messages = [
{
"role": "system",
"content": 'You are a helpful assistant designed to output JSON on this format: ' + str(json_format)
},
{
"role": "user",
"content": (
'Generate ' + str(sa_quantity) + ' short answer questions, and the possible answers, must have '
'maximum 3 words per answer, about this text:\n"' + text + '"')
},
{
"role": "user",
"content": 'The id starts at ' + str(start_id) + '.'
}
]
return (
await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["questions"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
)["questions"]
async def _gen_text_multiple_choice_utas(self, text: str, start_id: int, mc_quantity: int):
json_format = {
"questions": [
{
"id": "9",
"options": [
{
"id": "A",
"text": "a"
},
{
"id": "B",
"text": "b"
},
{
"id": "C",
"text": "c"
},
{
"id": "D",
"text": "d"
}
],
"prompt": "prompt",
"solution": "A",
"variant": "text"
}
]
}
messages = [
{
"role": "system",
"content": 'You are a helpful assistant designed to output JSON on this format: ' + str(json_format)
},
{
"role": "user",
"content": 'Generate ' + str(
mc_quantity) + ' multiple choice questions of 4 options for this text:\n' + text
},
{
"role": "user",
"content": 'Make sure every question only has 1 correct answer.'
}
]
question = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["questions"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
if len(question["questions"]) != mc_quantity:
return await self._gen_text_multiple_choice_utas(text, start_id, mc_quantity)
else:
return ExercisesHelper.fix_exercise_ids(question, start_id)["questions"]
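
The question generators in this file retry by calling a generator again whenever the model returns the wrong number of questions, with no upper bound on attempts. One way to cap that is sketched below under assumptions: `generate_with_retry`, `max_attempts` and the `generate` callable are hypothetical names, and the real `self._llm.prediction` call would sit inside `generate`.

```python
import asyncio
from typing import Awaitable, Callable, Dict


async def generate_with_retry(
    generate: Callable[[], Awaitable[Dict]],
    quantity: int,
    max_attempts: int = 3,
) -> Dict:
    """Call `generate` until it yields exactly `quantity` questions, or give up."""
    for _ in range(max_attempts):
        result = await generate()
        if len(result.get("questions", [])) == quantity:
            return result
    # Surface the failure instead of recursing indefinitely.
    raise RuntimeError(f"no response with {quantity} questions after {max_attempts} attempts")
```

A loop with an explicit limit keeps the number of LLM calls predictable and lets the caller decide how to report a persistent mismatch.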


@@ -0,0 +1,393 @@
import uuid
from queue import Queue
import random
from typing import Dict
from app.repositories.abc import IFileStorage, IDocumentStore
from app.services.abc import IListeningService, ILLMService, ITextToSpeechService
from app.configs.question_templates import getListeningTemplate, getListeningPartTemplate
from app.configs.constants import (
NeuralVoices, GPTModels, TemperatureSettings, FilePaths, MinTimers, ExamVariant
)
from app.helpers import ExercisesHelper
class ListeningService(IListeningService):
CONVERSATION_TAIL = (
"Please include random names and genders for the characters in your dialogue. "
"Make sure that the generated conversation does not contain forbidden subjects in Muslim countries."
)
MONOLOGUE_TAIL = (
"Make sure that the generated monologue does not contain forbidden subjects in Muslim countries."
)
def __init__(
self, llm: ILLMService,
tts: ITextToSpeechService,
file_storage: IFileStorage,
document_store: IDocumentStore
):
self._llm = llm
self._tts = tts
self._file_storage = file_storage
self._document_store = document_store
self._sections = {
"section_1": {
"generate_dialogue": self._generate_listening_conversation,
"type": "conversation"
},
"section_2": {
"generate_dialogue": self._generate_listening_monologue,
"type": "monologue"
},
"section_3": {
"generate_dialogue": self._generate_listening_conversation,
"type": "conversation"
},
"section_4": {
"generate_dialogue": self._generate_listening_monologue,
"type": "monologue"
}
}
async def generate_listening_question(self, section: int, topic: str):
return await self._sections[f'section_{section}']["generate_dialogue"](section, topic)
async def generate_listening_exercises(
self, section: int, dialog: str,
req_exercises: list[str], number_of_exercises_q: Queue,
start_id: int, difficulty: str
):
dialog_type = self._sections[f'section_{section}']["type"]
exercises = []
for req_exercise in req_exercises:
number_of_exercises = number_of_exercises_q.get()
if req_exercise == "multipleChoice":
question = await self._gen_multiple_choice_exercise_listening(
dialog_type, dialog, number_of_exercises, start_id, difficulty
)
exercises.append(question)
print("Added multiple choice: " + str(question))
elif req_exercise == "writeBlanksQuestions":
question = await self._gen_write_blanks_questions_exercise_listening(
dialog_type, dialog, number_of_exercises, start_id, difficulty
)
exercises.append(question)
print("Added write blanks questions: " + str(question))
elif req_exercise == "writeBlanksFill":
question = await self._gen_write_blanks_notes_exercise_listening(
dialog_type, dialog, number_of_exercises, start_id, difficulty
)
exercises.append(question)
print("Added write blanks notes: " + str(question))
elif req_exercise == "writeBlanksForm":
question = await self._gen_write_blanks_form_exercise_listening(
dialog_type, dialog, number_of_exercises, start_id, difficulty
)
exercises.append(question)
print("Added write blanks form: " + str(question))
start_id = start_id + number_of_exercises
return exercises
async def save_listening(self, parts: list[dict], min_timer: int, difficulty: str):
template = getListeningTemplate()
template['difficulty'] = difficulty
listening_id = str(uuid.uuid4())
for i, part in enumerate(parts, start=0):
part_template = getListeningPartTemplate()
file_name = str(uuid.uuid4()) + ".mp3"
sound_file_path = FilePaths.AUDIO_FILES_PATH + file_name
firebase_file_path = FilePaths.FIREBASE_LISTENING_AUDIO_FILES_PATH + file_name
if "conversation" in part["text"]:
await self._tts.text_to_speech(part["text"]["conversation"], sound_file_path)
else:
await self._tts.text_to_speech(part["text"], sound_file_path)
file_url = await self._file_storage.upload_file_firebase_get_url(firebase_file_path, sound_file_path)
part_template["audio"]["source"] = file_url
part_template["exercises"] = part["exercises"]
template['parts'].append(part_template)
if min_timer != MinTimers.LISTENING_MIN_TIMER_DEFAULT:
template["minTimer"] = min_timer
template["variant"] = ExamVariant.PARTIAL.value
else:
template["variant"] = ExamVariant.FULL.value
(result, listening_id) = await self._document_store.save_to_db_with_id("listening", template, listening_id)
if result:
return {**template, "id": listening_id}
else:
raise Exception("Failed to save question: " + str(parts))
# ==================================================================================================================
# generate_listening_question helpers
# ==================================================================================================================
async def _generate_listening_conversation(self, section: int, topic: str) -> Dict:
head = (
'Compose an authentic conversation between two individuals in the everyday social context of "'
if section == 1 else
'Compose an authentic and elaborate conversation between up to four individuals in the everyday '
'social context of "'
)
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"conversation": [{"name": "name", "gender": "gender", "text": "text"}]}')
},
{
"role": "user",
"content": (
f'{head}{topic}". {self.CONVERSATION_TAIL}'
)
}
]
response = await self._llm.prediction(
GPTModels.GPT_4_O,
messages,
["conversation"],
TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
return self._get_conversation_voices(response, True)
async def _generate_listening_monologue(self, section: int, topic: str) -> str:
context = 'social context' if section == 2 else 'academic subject'
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"monologue": "monologue"}')
},
{
"role": "user",
"content": (
f'Generate a comprehensive monologue set in the {context} of "{topic}". {self.MONOLOGUE_TAIL}'
)
}
]
response = await self._llm.prediction(
GPTModels.GPT_4_O,
messages,
["monologue"],
TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
return response["monologue"]
def _get_conversation_voices(self, response: Dict, unique_voices_across_segments: bool):
chosen_voices = []
name_to_voice = {}
for segment in response['conversation']:
if 'voice' not in segment:
name = segment['name']
if name in name_to_voice:
voice = name_to_voice[name]
else:
voice = None
# section 1
if unique_voices_across_segments:
while voice is None:
chosen_voice = self._get_random_voice(segment['gender'])
if chosen_voice not in chosen_voices:
voice = chosen_voice
chosen_voices.append(voice)
# section 3
else:
voice = self._get_random_voice(segment['gender'])
name_to_voice[name] = voice
segment['voice'] = voice
return response
@staticmethod
def _get_random_voice(gender: str):
if gender.lower() == 'male':
available_voices = NeuralVoices.MALE_NEURAL_VOICES
else:
available_voices = NeuralVoices.FEMALE_NEURAL_VOICES
return random.choice(available_voices)['Id']
# ==================================================================================================================
# generate_listening_exercises helpers
# ==================================================================================================================
async def _gen_multiple_choice_exercise_listening(
self, dialog_type: str, text: str, quantity: int, start_id, difficulty
):
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"questions": [{"id": "9", "options": [{"id": "A", "text": "Economic benefits"}, {"id": "B", "text": '
'"Government regulations"}, {"id": "C", "text": "Concerns about climate change"}, {"id": "D", "text": '
'"Technological advancement"}], "prompt": "What is the main reason for the shift towards renewable '
'energy sources?", "solution": "C", "variant": "text"}]}')
},
{
"role": "user",
"content": (
f'Generate {str(quantity)} {difficulty} difficulty multiple choice questions of 4 options '
f'for this {dialog_type}:\n"' + text + '"')
}
]
questions = await self._llm.prediction(
GPTModels.GPT_4_O,
messages,
["questions"],
TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
return {
"id": str(uuid.uuid4()),
"prompt": "Select the appropriate option.",
"questions": ExercisesHelper.fix_exercise_ids(questions, start_id)["questions"],
"type": "multipleChoice",
}
async def _gen_write_blanks_questions_exercise_listening(
self, dialog_type: str, text: str, quantity: int, start_id, difficulty
):
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"questions": [{"question": "question", "possible_answers": ["answer_1", "answer_2"]}]}')
},
{
"role": "user",
"content": (
f'Generate {str(quantity)} {difficulty} difficulty short answer questions, and the '
f'possible answers (max 3 words per answer), about this {dialog_type}:\n"{text}"')
}
]
questions = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["questions"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
questions = questions["questions"][:quantity]
return {
"id": str(uuid.uuid4()),
"maxWords": 3,
"prompt": f"You will hear a {dialog_type}. Answer the questions below using no more than three words or a number accordingly.",
"solutions": ExercisesHelper.build_write_blanks_solutions(questions, start_id),
"text": ExercisesHelper.build_write_blanks_text(questions, start_id),
"type": "writeBlanks"
}
async def _gen_write_blanks_notes_exercise_listening(
self, dialog_type: str, text: str, quantity: int, start_id, difficulty
):
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"notes": ["note_1", "note_2"]}')
},
{
"role": "user",
"content": (
f'Generate {str(quantity)} {difficulty} difficulty notes taken from this '
f'{dialog_type}:\n"{text}"'
)
}
]
questions = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["notes"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
questions = questions["notes"][:quantity]
formatted_phrases = "\n".join([f"{i + 1}. {phrase}" for i, phrase in enumerate(questions)])
word_messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this '
'format: {"words": ["word_1", "word_2"] }'
)
},
{
"role": "user",
"content": ('Select 1 word from each phrase in this list:\n"' + formatted_phrases + '"')
}
]
words = await self._llm.prediction(
GPTModels.GPT_4_O, word_messages, ["words"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
words = words["words"][:quantity]
replaced_notes = ExercisesHelper.replace_first_occurrences_with_placeholders_notes(questions, words, start_id)
return {
"id": str(uuid.uuid4()),
"maxWords": 3,
"prompt": "Fill the blank space with the word missing from the audio.",
"solutions": ExercisesHelper.build_write_blanks_solutions_listening(words, start_id),
"text": "\\n".join(replaced_notes),
"type": "writeBlanks"
}
async def _gen_write_blanks_form_exercise_listening(
self, dialog_type: str, text: str, quantity: int, start_id, difficulty
):
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"form": ["key: value", "key2: value"]}')
},
{
"role": "user",
"content": (
f'Generate a form with {str(quantity)} {difficulty} difficulty key-value pairs '
f'about this {dialog_type}:\n"{text}"'
)
}
]
parsed_form = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["form"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
parsed_form = parsed_form["form"][:quantity]
replaced_form, words = ExercisesHelper.build_write_blanks_text_form(parsed_form, start_id)
return {
"id": str(uuid.uuid4()),
"maxWords": 3,
"prompt": f"You will hear a {dialog_type}. Fill the form with words/numbers missing.",
"solutions": ExercisesHelper.build_write_blanks_solutions_listening(words, start_id),
"text": replaced_form,
"type": "writeBlanks"
}

@@ -0,0 +1,287 @@
import random
import uuid
from queue import Queue
from typing import List
from app.services.abc import IReadingService, ILLMService
from app.configs.constants import QuestionType, TemperatureSettings, FieldsAndExercises, GPTModels
from app.helpers import ExercisesHelper
class ReadingService(IReadingService):
def __init__(self, llm: ILLMService):
self._llm = llm
self._passages = {
"passage_1": {
"question_type": QuestionType.READING_PASSAGE_1,
"start_id": 1
},
"passage_2": {
"question_type": QuestionType.READING_PASSAGE_2,
"start_id": 14
},
"passage_3": {
"question_type": QuestionType.READING_PASSAGE_3,
"start_id": 27
}
}
async def gen_reading_passage(
self,
passage_id: int,
topic: str,
req_exercises: List[str],
number_of_exercises_q: Queue,
difficulty: str
):
_passage = self._passages[f'passage_{str(passage_id)}']
passage = await self.generate_reading_passage(_passage["question_type"], topic)
if passage == "":
return await self.gen_reading_passage(passage_id, topic, req_exercises, number_of_exercises_q, difficulty)
start_id = _passage["start_id"]
exercises = await self._generate_reading_exercises(
passage["text"], req_exercises, number_of_exercises_q, start_id, difficulty
)
if ExercisesHelper.contains_empty_dict(exercises):
return await self.gen_reading_passage(passage_id, topic, req_exercises, number_of_exercises_q, difficulty)
return {
"exercises": exercises,
"text": {
"content": passage["text"],
"title": passage["title"]
},
"difficulty": difficulty
}
async def generate_reading_passage(self, q_type: QuestionType, topic: str):
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"title": "title of the text", "text": "generated text"}')
},
{
"role": "user",
"content": (
f'Generate an extensive text for IELTS {q_type.value}, of at least 1500 words, '
f'on the topic of "{topic}". The passage should offer a substantial amount of '
'information, analysis, or narrative relevant to the chosen subject matter. This text '
'passage aims to serve as the primary reading section of an IELTS test, providing an '
'in-depth and comprehensive exploration of the topic. Make sure that the generated text '
'does not contain forbidden subjects in Muslim countries.'
)
}
]
return await self._llm.prediction(
GPTModels.GPT_4_O,
messages,
FieldsAndExercises.GEN_TEXT_FIELDS,
TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
async def _generate_reading_exercises(
self, passage: str, req_exercises: list, number_of_exercises_q, start_id, difficulty
):
exercises = []
for req_exercise in req_exercises:
number_of_exercises = number_of_exercises_q.get()
if req_exercise == "fillBlanks":
question = await self._gen_summary_fill_blanks_exercise(passage, number_of_exercises, start_id, difficulty)
exercises.append(question)
print("Added fill blanks: " + str(question))
elif req_exercise == "trueFalse":
question = await self._gen_true_false_not_given_exercise(passage, number_of_exercises, start_id, difficulty)
exercises.append(question)
print("Added trueFalse: " + str(question))
elif req_exercise == "writeBlanks":
question = await self._gen_write_blanks_exercise(passage, number_of_exercises, start_id, difficulty)
if ExercisesHelper.answer_word_limit_ok(question):
exercises.append(question)
print("Added write blanks: " + str(question))
else:
exercises.append({})
print("Did not add write blanks because it did not respect word limit")
elif req_exercise == "paragraphMatch":
question = await self._gen_paragraph_match_exercise(passage, number_of_exercises, start_id)
exercises.append(question)
print("Added paragraph match: " + str(question))
start_id = start_id + number_of_exercises
return exercises
async def _gen_summary_fill_blanks_exercise(self, text: str, quantity: int, start_id, difficulty):
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{ "summary": "summary", "words": ["word_1", "word_2"] }')
},
{
"role": "user",
"content": (
f'Summarize this text: "{text}"'
)
},
{
"role": "user",
"content": (
f'Select {str(quantity)} {difficulty} difficulty words, it must be words and not '
'expressions, from the summary.'
)
}
]
response = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["summary", "words"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
replaced_summary = ExercisesHelper.replace_first_occurrences_with_placeholders(response["summary"], response["words"], start_id)
options_words = ExercisesHelper.add_random_words_and_shuffle(response["words"], 5)
solutions = ExercisesHelper.fillblanks_build_solutions_array(response["words"], start_id)
return {
"allowRepetition": True,
"id": str(uuid.uuid4()),
"prompt": (
"Complete the summary below. Click a blank to select the corresponding word(s) for it.\\nThere are "
"more words than spaces so you will not use them all. You may use any of the words more than once."
),
"solutions": solutions,
"text": replaced_summary,
"type": "fillBlanks",
"words": options_words
}
async def _gen_true_false_not_given_exercise(self, text: str, quantity: int, start_id, difficulty):
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"prompts":[{"prompt": "statement_1", "solution": "true/false/not_given"}, '
'{"prompt": "statement_2", "solution": "true/false/not_given"}]}')
},
{
"role": "user",
"content": (
f'Generate {str(quantity)} {difficulty} difficulty statements based on the provided text. '
'Ensure that your statements accurately represent information or inferences from the text, and '
'provide a variety of responses, including, at least one of each True, False, and Not Given, '
f'as appropriate.\n\nReference text:\n\n {text}'
)
}
]
response = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["prompts"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
questions = response["prompts"]
if len(questions) > quantity:
questions = ExercisesHelper.remove_excess_questions(questions, len(questions) - quantity)
for i, question in enumerate(questions, start=start_id):
question["id"] = str(i)
return {
"id": str(uuid.uuid4()),
"prompt": "Do the following statements agree with the information given in the Reading Passage?",
"questions": questions,
"type": "trueFalse"
}
async def _gen_write_blanks_exercise(self, text: str, quantity: int, start_id, difficulty):
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"questions": [{"question": question, "possible_answers": ["answer_1", "answer_2"]}]}')
},
{
"role": "user",
"content": (
f'Generate {str(quantity)} {difficulty} difficulty short answer questions, and the '
f'possible answers, must have maximum 3 words per answer, about this text:\n"{text}"'
)
}
]
response = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["questions"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
questions = response["questions"][:quantity]
return {
"id": str(uuid.uuid4()),
"maxWords": 3,
"prompt": "Choose no more than three words and/or a number from the passage for each answer.",
"solutions": ExercisesHelper.build_write_blanks_solutions(questions, start_id),
"text": ExercisesHelper.build_write_blanks_text(questions, start_id),
"type": "writeBlanks"
}
async def _gen_paragraph_match_exercise(self, text: str, quantity: int, start_id):
paragraphs = ExercisesHelper.assign_letters_to_paragraphs(text)
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"headings": [ {"heading": "first paragraph heading"}, {"heading": "second paragraph heading"}]}')
},
{
"role": "user",
"content": (
'For every paragraph of the list generate a minimum 5 word heading for it. '
f'The paragraphs are these: {str(paragraphs)}'
)
}
]
response = await self._llm.prediction(
GPTModels.GPT_4_O, messages, ["headings"], TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
headings = response["headings"]
options = []
for i, paragraph in enumerate(paragraphs, start=0):
# Each entry follows the requested template {"heading": "..."}, so unwrap the string
paragraph["heading"] = headings[i]["heading"]
options.append({
"id": paragraph["letter"],
"sentence": paragraph["paragraph"]
})
random.shuffle(paragraphs)
sentences = []
for i, paragraph in enumerate(paragraphs, start=start_id):
sentences.append({
"id": i,
"sentence": paragraph["heading"],
"solution": paragraph["letter"]
})
return {
"id": str(uuid.uuid4()),
"allowRepetition": False,
"options": options,
"prompt": "Choose the correct heading for paragraphs from the list of headings below.",
"sentences": sentences[:quantity],
"type": "matchSentences"
}
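`gen_reading_passage` above retries by calling itself whenever the passage or an exercise comes back empty, with no cap on the number of attempts. A minimal sketch of a bounded alternative follows; `generate_with_retries`, `generate`, `is_valid` and `max_attempts` are hypothetical names used for illustration and are not part of this commit.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def generate_with_retries(
    generate: Callable[[], Awaitable[Any]],
    is_valid: Callable[[Any], bool],
    max_attempts: int = 3,
) -> Any:
    """Await generate() until is_valid(result) holds, at most max_attempts times."""
    for _ in range(max_attempts):
        result = await generate()
        if is_valid(result):
            return result
    raise RuntimeError(f"no valid result after {max_attempts} attempts")


async def _demo() -> dict:
    # Stand-in for the LLM call: empty on the first two attempts, then a passage.
    replies = iter(["", "", {"title": "Tides", "text": "..."}])

    async def fake_generate():
        return next(replies)

    return await generate_with_retries(fake_generate, lambda r: r != "")


passage = asyncio.run(_demo())
```

A retry of the exercise step would also need the per-exercise counts supplied again, because `Queue.get()` removes each count as it is read.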

@@ -0,0 +1,521 @@
import logging
import os
import re
import uuid
import random
from typing import Dict, List
from app.repositories.abc import IFileStorage, IDocumentStore
from app.services.abc import ISpeakingService, ILLMService, IVideoGeneratorService, ISpeechToTextService
from app.configs.constants import (
FieldsAndExercises, GPTModels, TemperatureSettings,
AvatarEnum, FilePaths
)
from app.helpers import TextHelper
class SpeakingService(ISpeakingService):
def __init__(
self, llm: ILLMService, vid_gen: IVideoGeneratorService,
file_storage: IFileStorage, document_store: IDocumentStore,
stt: ISpeechToTextService
):
self._llm = llm
self._vid_gen = vid_gen
self._file_storage = file_storage
self._document_store = document_store
self._stt = stt
self._logger = logging.getLogger(__name__)
self._tasks = {
"task_1": {
"get": {
"json_template": (
'{"topic": "topic", "question": "question"}'
),
"prompt": (
'Craft a thought-provoking question of {difficulty} difficulty for IELTS Speaking Part 1 '
'that encourages candidates to delve deeply into personal experiences, preferences, or '
'insights on the topic of "{topic}". Instruct the candidate to offer not only detailed '
'descriptions but also provide nuanced explanations, examples, or anecdotes to enrich '
'their response. Make sure that the generated question does not contain forbidden subjects in '
'Muslim countries.'
)
}
},
"task_2": {
"get": {
"json_template": (
'{"topic": "topic", "question": "question", "prompts": ["prompt_1", "prompt_2", "prompt_3"]}'
),
"prompt": (
'Create a question of {difficulty} difficulty for IELTS Speaking Part 2 '
'that encourages candidates to narrate a personal experience or story related to the topic '
'of "{topic}". Include 3 prompts that guide the candidate to describe '
'specific aspects of the experience, such as details about the situation, '
'their actions, and the reasons it left a lasting impression. Make sure that the '
'generated question does not contain forbidden subjects in Muslim countries.'
)
}
},
"task_3": {
"get": {
"json_template": (
'{"topic": "topic", "questions": ["question", "question", "question"]}'
),
"prompt": (
'Formulate a set of 3 questions of {difficulty} difficulty for IELTS Speaking Part 3 '
'that encourage candidates to engage in a meaningful discussion on the topic of "{topic}". '
'Provide inquiries, ensuring they explore various aspects, perspectives, and implications '
'related to the topic. Make sure that the generated question does not contain forbidden '
'subjects in Muslim countries.'
)
}
},
}
async def get_speaking_task(self, task_id: int, topic: str, difficulty: str):
task_values = self._tasks[f'task_{task_id}']['get']
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: ' +
task_values["json_template"]
)
},
{
"role": "user",
"content": str(task_values["prompt"]).format(topic=topic, difficulty=difficulty)
}
]
response = await self._llm.prediction(
GPTModels.GPT_4_O, messages, FieldsAndExercises.GEN_FIELDS, TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
# TODO: this was on GET /speaking_task_3 don't know if it is intentional only for 3
if task_id == 3:
# Remove the numbers from the questions only if the string starts with a number
response["questions"] = [
re.sub(r"^\d+\.\s*", "", question)
if re.match(r"^\d+\.", question) else question
for question in response["questions"]
]
response["type"] = task_id
response["difficulty"] = difficulty
response["topic"] = topic
return response
async def grade_speaking_task_1_and_2(
self, task: int, question: str, answer_firebase_path: str, sound_file_name: str
):
request_id = uuid.uuid4()
req_data = {
"question": question,
"answer": answer_firebase_path
}
self._logger.info(
f'POST - speaking_task_{task} - Received request to grade speaking task {task}. '
f'Use this id to track the logs: {str(request_id)} - Request data: {str(req_data)}'
)
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Downloading file {answer_firebase_path}')
await self._file_storage.download_firebase_file(answer_firebase_path, sound_file_name)
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Downloaded file {answer_firebase_path} to {sound_file_name}')
answer = await self._stt.speech_to_text(sound_file_name)
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Transcribed answer: {answer}')
if TextHelper.has_x_words(answer, 20):
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"comment": "comment about answer quality", "overall": 0.0, '
'"task_response": {"Fluency and Coherence": 0.0, "Lexical Resource": 0.0, '
'"Grammatical Range and Accuracy": 0.0, "Pronunciation": 0.0}}')
},
{
"role": "user",
"content": (
f'Evaluate the given Speaking Part {task} response based on the IELTS grading system, ensuring a '
'strict assessment that penalizes errors. Deduct points for deviations from the task, and '
'assign a score of 0 if the response fails to address the question. Additionally, provide '
'detailed commentary highlighting both strengths and weaknesses in the response.'
f'\n Question: "{question}" \n Answer: "{answer}"')
}
]
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Requesting grading of the answer.')
response = await self._llm.prediction(
GPTModels.GPT_3_5_TURBO,
messages,
["comment"],
TemperatureSettings.GRADING_TEMPERATURE
)
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Answer graded: {str(response)}')
perfect_answer_messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"answer": "perfect answer"}'
)
},
{
"role": "user",
"content": (
'Provide a perfect answer according to the IELTS grading system to the following '
f'Speaking Part {task} question: "{question}"')
}
]
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Requesting perfect answer.')
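# Keep the grading result in `response`; store the model answer separately so that
# "overall" and "task_response" are still available below.
perfect_answer = await self._llm.prediction(
GPTModels.GPT_3_5_TURBO,
perfect_answer_messages,
["answer"],
TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
response['perfect_answer'] = perfect_answer["answer"]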
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Perfect answer: ' + response['perfect_answer'])
response['transcript'] = answer
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Requesting fixed text.')
response['fixed_text'] = await self._get_speaking_corrections(answer)
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Fixed text: ' + response['fixed_text'])
if response["overall"] == "0.0" or response["overall"] == 0.0:
response["overall"] = self._calculate_overall(response)
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Final response: {str(response)}')
return response
else:
self._logger.info(
f'POST - speaking_task_{task} - {str(request_id)} - '
f'The answer had fewer words than the threshold of 20 to be graded. Answer: {answer}'
)
return self._zero_rating("The audio recorded does not contain enough English words to be graded.")
# TODO: grade_speaking_task_1_and_2 can be merged with this when there's more time
async def grade_speaking_task_3(self, answers: Dict, task: int = 3):
request_id = uuid.uuid4()
self._logger.info(
f'POST - speaking_task_{task} - Received request to grade speaking task {task}. '
f'Use this id to track the logs: {str(request_id)} - Request data: {str(answers)}'
)
text_answers = []
perfect_answers = []
self._logger.info(
f'POST - speaking_task_{task} - {str(request_id)} - Received {str(len(answers))} total answers.'
)
for item in answers:
sound_file_name = FilePaths.AUDIO_FILES_PATH + str(uuid.uuid4())
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Downloading file {item["answer"]}')
await self._file_storage.download_firebase_file(item["answer"], sound_file_name)
self._logger.info(
f'POST - speaking_task_{task} - {str(request_id)} - '
'Downloaded file ' + item["answer"] + f' to {sound_file_name}'
)
answer_text = await self._stt.speech_to_text(sound_file_name)
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Transcribed answer: {answer_text}')
text_answers.append(answer_text)
item["answer"] = answer_text
os.remove(sound_file_name)
if not TextHelper.has_x_words(answer_text, 20):
self._logger.info(
f'POST - speaking_task_{task} - {str(request_id)} - '
f'The answer had fewer words than the threshold of 20 to be graded. Answer: {answer_text}')
return self._zero_rating("The audio recorded does not contain enough English words to be graded.")
perfect_answer_messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"answer": "perfect answer"}'
)
},
{
"role": "user",
"content": (
'Provide a perfect answer according to the IELTS grading system to the following '
f'Speaking Part {task} question: "{item["question"]}"'
)
}
]
self._logger.info(
f'POST - speaking_task_{task} - {str(request_id)} - '
f'Requesting perfect answer for question: {item["question"]}'
)
perfect_answers.append(
await self._llm.prediction(
GPTModels.GPT_3_5_TURBO,
perfect_answer_messages,
["answer"],
TemperatureSettings.GEN_QUESTION_TEMPERATURE
)
)
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"comment": "comment about answer quality", "overall": 0.0, '
'"task_response": {"Fluency and Coherence": 0.0, "Lexical Resource": 0.0, '
'"Grammatical Range and Accuracy": 0.0, "Pronunciation": 0.0}}')
}
]
message = (
f"Evaluate the given Speaking Part {task} response based on the IELTS grading system, ensuring a "
"strict assessment that penalizes errors. Deduct points for deviations from the task, and "
"assign a score of 0 if the response fails to address the question. Additionally, provide detailed "
"commentary highlighting both strengths and weaknesses in the response."
"\n\n The questions and answers are: \n\n'")
self._logger.info(
f'POST - speaking_task_{task} - {str(request_id)} - Formatting answers and questions for prompt.'
)
formatted_text = ""
for i, entry in enumerate(answers, start=1):
formatted_text += f"**Question {i}:**\n{entry['question']}\n\n"
formatted_text += f"**Answer {i}:**\n{entry['answer']}\n\n"
self._logger.info(
f'POST - speaking_task_{task} - {str(request_id)} - Formatted answers and questions for prompt: {formatted_text}'
)
message += formatted_text
messages.append({
"role": "user",
"content": message
})
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Requesting grading of the answers.')
response = await self._llm.prediction(
GPTModels.GPT_3_5_TURBO, messages, ["comment"], TemperatureSettings.GRADING_TEMPERATURE
)
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Answers graded: {str(response)}')
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Adding perfect answers to response.')
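for i, answer in enumerate(perfect_answers, start=1):
# Each prediction follows the template {"answer": "..."}; store the text, as parts 1 and 2 do
response['perfect_answer_' + str(i)] = answer["answer"]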
self._logger.info(
f'POST - speaking_task_{task} - {str(request_id)} - Adding transcript and fixed texts to response.'
)
for i, answer in enumerate(text_answers, start=1):
response['transcript_' + str(i)] = answer
response['fixed_text_' + str(i)] = await self._get_speaking_corrections(answer)
if response["overall"] == "0.0" or response["overall"] == 0.0:
response["overall"] = self._calculate_overall(response)
self._logger.info(f'POST - speaking_task_{task} - {str(request_id)} - Final response: {str(response)}')
return response
# ==================================================================================================================
# grade_speaking_task helpers
# ==================================================================================================================
@staticmethod
def _zero_rating(comment: str):
return {
"comment": comment,
"overall": 0,
"task_response": {
"Fluency and Coherence": 0,
"Lexical Resource": 0,
"Grammatical Range and Accuracy": 0,
"Pronunciation": 0
}
}
@staticmethod
def _calculate_overall(response: Dict):
return round(
(
response["task_response"]["Fluency and Coherence"] +
response["task_response"]["Lexical Resource"] +
response["task_response"]["Grammatical Range and Accuracy"] +
response["task_response"]["Pronunciation"]
) / 4, 1
)
async def _get_speaking_corrections(self, text):
messages = [
{
"role": "system",
"content": (
'You are a helpful assistant designed to output JSON on this format: '
'{"fixed_text": "fixed transcription with no misspelling errors"}'
)
},
{
"role": "user",
"content": (
'Fix the errors in the provided transcription and put it in a JSON. '
f'Do not complete the answer, only replace what is wrong. \n The text: "{text}"'
)
}
]
response = await self._llm.prediction(
GPTModels.GPT_3_5_TURBO,
messages,
["fixed_text"],
0.2,
False
)
return response["fixed_text"]
async def create_videos_and_save_to_db(self, exercises, template, req_id):
template = await self._create_video_per_part(exercises, template, 1)
template = await self._create_video_per_part(exercises, template, 2)
template = await self._create_video_per_part(exercises, template, 3)
await self._document_store.save_to_db_with_id("speaking", template, req_id)
self._logger.info(f'Saved speaking to DB with id {req_id} : {str(template)}')
async def _create_video_per_part(self, exercises: List[Dict], template: Dict, part: int):
template_index = part - 1
# Using list comprehension to find the element with the desired value in the 'type' field
found_exercises = [element for element in exercises if element.get('type') == part]
# Check if any elements were found
if found_exercises:
exercise = found_exercises[0]
self._logger.info(f'Creating video for speaking part {part}')
if part in {1, 2}:
result = await self._create_video(
exercise["question"],
(random.choice(list(AvatarEnum))).value,
f'Failed to create video for part {part} question: {str(exercise["question"])}'
)
if result is not None:
if part == 2:
template["exercises"][template_index]["prompts"] = exercise["prompts"]
template["exercises"][template_index]["text"] = exercise["question"]
template["exercises"][template_index]["title"] = exercise["topic"]
template["exercises"][template_index]["video_url"] = result["video_url"]
template["exercises"][template_index]["video_path"] = result["video_path"]
else:
questions = []
for question in exercise["questions"]:
result = await self._create_video(
question,
(random.choice(list(AvatarEnum))).value,
f'Failed to create video for part {part} question: {question}'
)
if result is not None:
video = {
"text": question,
"video_path": result["video_path"],
"video_url": result["video_url"]
}
questions.append(video)
template["exercises"][template_index]["prompts"] = questions
template["exercises"][template_index]["title"] = exercise["topic"]
if not found_exercises:
template["exercises"].pop(template_index)
return template
# TODO: Check if it is intended to log the original question
async def generate_speaking_video(self, original_question: str, topic: str, avatar: str, prompts: List[str]):
if len(prompts) > 0:
question = original_question + " In your answer you should consider: " + " ".join(prompts)
else:
question = original_question
error_msg = f'Failed to create video for part 1 question: {original_question}'
result = await self._create_video(
question,
avatar,
error_msg
)
if result is not None:
return {
"text": original_question,
"prompts": prompts,
"title": topic,
**result,
"type": "speaking",
"id": str(uuid.uuid4())
}
else:
return str(error_msg)
async def generate_interactive_video(self, questions: List[str], avatar: str, topic: str):
sp_questions = []
self._logger.info('Creating videos for speaking part 3')
for question in questions:
result = await self._create_video(
question,
avatar,
f'Failed to create video for part 3 question: {question}'
)
if result is not None:
video = {
"text": question,
**result
}
sp_questions.append(video)
return {
"prompts": sp_questions,
"title": topic,
"type": "interactiveSpeaking",
"id": str(uuid.uuid4())
}
async def _create_video(self, question: str, avatar: str, error_message: str):
result = await self._vid_gen.create_video(question, avatar)
if result is not None:
sound_file_path = FilePaths.VIDEO_FILES_PATH + result
firebase_file_path = FilePaths.FIREBASE_SPEAKING_VIDEO_FILES_PATH + result
url = await self._file_storage.upload_file_firebase_get_url(firebase_file_path, sound_file_path)
return {
"video_path": firebase_file_path,
"video_url": url
}
self._logger.error(error_message)
return None
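Both grading methods above fall back to `_calculate_overall` when the model returns an `overall` of `0.0` (as a number or as the string `"0.0"`). The fallback can be sketched on its own as below; `with_overall_fallback` and the sample scores are illustrative assumptions, not code or data from this commit.

```python
CRITERIA = (
    "Fluency and Coherence",
    "Lexical Resource",
    "Grammatical Range and Accuracy",
    "Pronunciation",
)


def with_overall_fallback(response: dict) -> dict:
    """Recompute "overall" as the mean of the four criteria when it was left at zero."""
    if response["overall"] in ("0.0", 0.0):
        scores = [response["task_response"][name] for name in CRITERIA]
        response["overall"] = round(sum(scores) / len(scores), 1)
    return response


# Made-up scores, only to exercise the fallback path.
graded = with_overall_fallback({
    "comment": "example only",
    "overall": 0.0,
    "task_response": {
        "Fluency and Coherence": 6.0,
        "Lexical Resource": 6.5,
        "Grammatical Range and Accuracy": 6.0,
        "Pronunciation": 7.0,
    },
})
```

A non-zero `overall` is left exactly as the model returned it.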

@@ -0,0 +1,13 @@
from .aws_polly import AWSPolly
from .heygen import Heygen
from .openai import OpenAI
from .whisper import OpenAIWhisper
from .gpt_zero import GPTZero
__all__ = [
"AWSPolly",
"Heygen",
"OpenAI",
"OpenAIWhisper",
"GPTZero"
]

@@ -0,0 +1,87 @@
import random
from typing import Union
import aiofiles
from aiobotocore.client import BaseClient
from app.services.abc import ITextToSpeechService
from app.configs.constants import NeuralVoices
class AWSPolly(ITextToSpeechService):
def __init__(self, client: BaseClient):
self._client = client
async def synthesize_speech(self, text: str, voice: str, engine: str = "neural", output_format: str = "mp3"):
tts_response = await self._client.synthesize_speech(
Engine=engine,
Text=text,
OutputFormat=output_format,
VoiceId=voice
)
return await tts_response['AudioStream'].read()
async def text_to_speech(self, text: Union[list[dict], str], file_name: str):
if isinstance(text, str):
audio_segments = await self._text_to_speech(text)
elif isinstance(text, list):
audio_segments = await self._conversation_to_speech(text)
else:
raise ValueError("Unsupported argument for text_to_speech")
final_message = await self.synthesize_speech(
"This audio recording, for the listening exercise, has finished.",
"Stephen"
)
# Add finish message
audio_segments.append(final_message)
# Combine the audio segments into a single audio file
combined_audio = b"".join(audio_segments)
# Save the combined audio to a single file
async with aiofiles.open(file_name, "wb") as f:
await f.write(combined_audio)
print("Speech segments saved to " + file_name)
async def _text_to_speech(self, text: str):
voice = random.choice(NeuralVoices.ALL_NEURAL_VOICES)['Id']
# Initialize an empty list to store audio segments
audio_segments = []
for part in self._divide_text(text):
audio_segments.append(await self.synthesize_speech(part, voice))
return audio_segments
async def _conversation_to_speech(self, conversation: list):
# Initialize an empty list to store audio segments
audio_segments = []
# Iterate through the text segments, convert to audio segments, and store them
for segment in conversation:
audio_segments.append(await self.synthesize_speech(segment["text"], segment["voice"]))
return audio_segments
@staticmethod
def _divide_text(text, max_length=3000):
if len(text) <= max_length:
return [text]
divisions = []
current_position = 0
while current_position < len(text):
next_position = min(current_position + max_length, len(text))
next_period_position = text.rfind('.', current_position, next_position)
if next_period_position != -1 and next_period_position > current_position:
divisions.append(text[current_position:next_period_position + 1])
current_position = next_period_position + 1
else:
# If no '.' found in the next chunk, split at max_length
divisions.append(text[current_position:next_position])
current_position = next_position
return divisions
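`_divide_text` above keeps each synthesis request under `max_length` characters and prefers to cut right after a period. The same logic as a standalone function, as a rough sketch: `divide_text` is a hypothetical free-function copy, and the tiny `max_length` in the example only makes the splitting visible.

```python
def divide_text(text: str, max_length: int = 3000) -> list[str]:
    """Split text into chunks of at most max_length characters, cutting after a period when possible."""
    if len(text) <= max_length:
        return [text]
    chunks = []
    position = 0
    while position < len(text):
        end = min(position + max_length, len(text))
        # Last period inside the current window, or -1 if there is none.
        last_period = text.rfind(".", position, end)
        if last_period > position:
            chunks.append(text[position:last_period + 1])
            position = last_period + 1
        else:
            # No usable period in the window: hard split at the window boundary.
            chunks.append(text[position:end])
            position = end
    return chunks


sample = "One. Two. Three."
chunks = divide_text(sample, max_length=6)
```

Joining the chunks always reproduces the input, so no text is lost between requests; a chunk may still end mid-sentence when a window contains no period.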

@@ -0,0 +1,52 @@
from logging import getLogger
from typing import Dict, Optional
from httpx import AsyncClient
from app.services.abc.third_parties.ai_detector import IAIDetectorService
class GPTZero(IAIDetectorService):
_GPT_ZERO_ENDPOINT = 'https://api.gptzero.me/v2/predict/text'
def __init__(self, client: AsyncClient, gpt_zero_key: str):
self._header = {
'x-api-key': gpt_zero_key
}
self._http_client = client
self._logger = getLogger(__name__)
async def run_detection(self, text: str):
data = {
'document': text,
'version': '',
'multilingual': False
}
response = await self._http_client.post(self._GPT_ZERO_ENDPOINT, headers=self._header, json=data)
if response.status_code != 200:
return None
return self._parse_detection(response.json())
def _parse_detection(self, response: Dict) -> Optional[Dict]:
try:
text_scan = response["documents"][0]
filtered_sentences = [
{
"sentence": item["sentence"],
"highlight_sentence_for_ai": item["highlight_sentence_for_ai"]
}
for item in text_scan["sentences"]
]
return {
"class_probabilities": text_scan["class_probabilities"],
"confidence_category": text_scan["confidence_category"],
"predicted_class": text_scan["predicted_class"],
"sentences": filtered_sentences
}
except Exception as e:
self._logger.error(f"Failed to parse GPTZero's response: {str(e)}")
return None
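
To make the shape that `_parse_detection` relies on concrete, here is a minimal standalone sketch. The payload is hypothetical: only the key names are taken from the parser above, while the values (and the `extra` key) are invented, so it should not be read as the authoritative GPTZero schema. The sketch also narrows the caught exceptions, which is a deviation from the `except Exception` in the diff.

```python
from typing import Dict, Optional


def parse_detection(response: Dict) -> Optional[Dict]:
    """Standalone copy of the parsing logic, for illustration only."""
    try:
        text_scan = response["documents"][0]
        return {
            "class_probabilities": text_scan["class_probabilities"],
            "confidence_category": text_scan["confidence_category"],
            "predicted_class": text_scan["predicted_class"],
            "sentences": [
                {
                    "sentence": item["sentence"],
                    "highlight_sentence_for_ai": item["highlight_sentence_for_ai"],
                }
                for item in text_scan["sentences"]
            ],
        }
    except (KeyError, IndexError, TypeError):
        # Missing keys or an empty "documents" list end up here.
        return None


# Hypothetical payload: values are made up, only the key names come from the parser.
sample_response = {
    "documents": [{
        "class_probabilities": {"ai": 0.1, "human": 0.9},
        "confidence_category": "high",
        "predicted_class": "human",
        "sentences": [
            {"sentence": "I wrote this myself.", "highlight_sentence_for_ai": False, "extra": 1},
        ],
    }]
}

parsed = parse_detection(sample_response)
malformed = parse_detection({"documents": []})
```

Any per-sentence keys other than the two that are copied are dropped, and a response without documents yields `None` rather than an exception.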

View File

@@ -0,0 +1,90 @@
import asyncio
import os
import logging

import aiofiles
from httpx import AsyncClient

from app.services.abc import IVideoGeneratorService


class Heygen(IVideoGeneratorService):
    # TODO: Not used, remove if not necessary
    # CREATE_VIDEO_URL = 'https://api.heygen.com/v1/template.generate'
    _GET_VIDEO_URL = 'https://api.heygen.com/v1/video_status.get'

    def __init__(self, client: AsyncClient, heygen_token: str):
        self._get_header = {
            'X-Api-Key': heygen_token
        }
        self._post_header = {
            'X-Api-Key': heygen_token,
            'Content-Type': 'application/json'
        }
        self._http_client = client
        self._logger = logging.getLogger(__name__)

    async def create_video(self, text: str, avatar: str):
        # POST TO CREATE VIDEO
        create_video_url = 'https://api.heygen.com/v2/template/' + avatar + '/generate'
        data = {
            "test": False,
            "caption": False,
            "title": "video_title",
            "variables": {
                "script_here": {
                    "name": "script_here",
                    "type": "text",
                    "properties": {
                        "content": text
                    }
                }
            }
        }
        response = await self._http_client.post(create_video_url, headers=self._post_header, json=data)
        self._logger.info(response.status_code)
        self._logger.info(response.json())

        # GET TO CHECK STATUS AND GET VIDEO WHEN READY
        video_id = response.json()["data"]["video_id"]
        params = {
            'video_id': video_id
        }
        response = {}
        status = "processing"
        error = None

        while status != "completed" and error is None:
            response = await self._http_client.get(self._GET_VIDEO_URL, headers=self._get_header, params=params)
            response_data = response.json()

            status = response_data["data"]["status"]
            error = response_data["data"]["error"]

            if status != "completed" and error is None:
                self._logger.info(f"Status: {status}")
                await asyncio.sleep(10)  # Wait 10 seconds before the next request

        self._logger.info(response.status_code)
        self._logger.info(response.json())

        # The loop also exits when the API reports an error; there is nothing to download then
        if error is not None:
            self._logger.error(f"Video generation failed: {error}")
            return None

        # DOWNLOAD VIDEO
        download_url = response.json()['data']['video_url']
        output_directory = 'download-video/'
        output_filename = video_id + '.mp4'

        response = await self._http_client.get(download_url)

        if response.status_code == 200:
            os.makedirs(output_directory, exist_ok=True)  # Create the directory if it doesn't exist
            output_path = os.path.join(output_directory, output_filename)
            async with aiofiles.open(output_path, 'wb') as f:
                await f.write(response.content)
            self._logger.info(f"File '{output_filename}' downloaded successfully.")
            return output_filename
        else:
            self._logger.error(f"Failed to download file. Status code: {response.status_code}")
            return None
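
The status loop in `create_video` polls until the job completes or reports an error. A minimal sketch of that pattern, detached from HTTP, is shown below. Everything here is an assumption for illustration: `poll_until_done`, `fetch_status`, and the `max_attempts` cap are hypothetical names and are not part of the commit, which polls without an upper bound.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional


async def poll_until_done(
    fetch_status: Callable[[], Awaitable[Dict]],
    interval_seconds: float = 10.0,
    max_attempts: int = 60,
) -> Optional[Dict]:
    """Call fetch_status until it reports completion, an error, or attempts run out."""
    for _ in range(max_attempts):
        data = await fetch_status()
        if data.get("error") is not None:
            return None
        if data.get("status") == "completed":
            return data
        await asyncio.sleep(interval_seconds)
    return None


async def _demo() -> Optional[Dict]:
    # Fake status source: one "processing" reply, then a completed one.
    replies = iter([
        {"status": "processing", "error": None},
        {"status": "completed", "error": None, "video_url": "https://example.invalid/video.mp4"},
    ])

    async def fake_fetch() -> Dict:
        return next(replies)

    return await poll_until_done(fake_fetch, interval_seconds=0)


result = asyncio.run(_demo())
```

Injecting the status fetcher keeps the loop testable without network access; capping the attempts is a design choice that avoids waiting forever on a job that never finishes.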

View File

@@ -0,0 +1,97 @@
import json
import re
import logging
from typing import List, Optional

from openai import AsyncOpenAI
from openai.types.chat import ChatCompletionMessageParam

from app.services.abc import ILLMService
from app.helpers import count_tokens
from app.configs.constants import BLACKLISTED_WORDS


class OpenAI(ILLMService):
    MAX_TOKENS = 4097
    TRY_LIMIT = 2

    def __init__(self, client: AsyncOpenAI):
        self._client = client
        self._logger = logging.getLogger(__name__)

    async def prediction(
        self,
        model: str,
        messages: List[ChatCompletionMessageParam],
        fields_to_check: Optional[List[str]],
        temperature: float,
        check_blacklisted: bool = True,
        token_count: int = -1
    ):
        # -1 means "not supplied": count the prompt tokens here
        if token_count == -1:
            token_count = self._count_total_tokens(messages)
        return await self._prediction(model, messages, token_count, fields_to_check, temperature, 0, check_blacklisted)

    async def _prediction(
        self,
        model: str,
        messages: List[ChatCompletionMessageParam],
        token_count: int,
        fields_to_check: Optional[List[str]],
        temperature: float,
        try_count: int,
        check_blacklisted: bool,
    ):
        result = await self._client.chat.completions.create(
            model=model,
            # Completion budget: what is left of MAX_TOKENS after the prompt, minus a 300-token margin
            max_tokens=int(self.MAX_TOKENS - token_count - 300),
            temperature=float(temperature),
            messages=messages,
            response_format={"type": "json_object"}
        )
        result = result.choices[0].message.content

        if check_blacklisted:
            found_blacklisted_word = self._get_found_blacklisted_words(result)

            if found_blacklisted_word is not None and try_count < self.TRY_LIMIT:
                self._logger.warning("Result contains blacklisted words: " + str(found_blacklisted_word))
                return await self._prediction(
                    model, messages, token_count, fields_to_check, temperature, (try_count + 1), check_blacklisted
                )
            elif found_blacklisted_word is not None and try_count >= self.TRY_LIMIT:
                return ""

        if fields_to_check is None:
            return json.loads(result)

        # result is still the raw JSON string here, so this is a substring check on the field names
        if not self._check_fields(result, fields_to_check) and try_count < self.TRY_LIMIT:
            return await self._prediction(
                model, messages, token_count, fields_to_check, temperature, (try_count + 1), check_blacklisted
            )

        return json.loads(result)

    async def prediction_override(self, **kwargs):
        return await self._client.chat.completions.create(
            **kwargs
        )

    @staticmethod
    def _get_found_blacklisted_words(text: str):
        text_lower = text.lower()
        for word in BLACKLISTED_WORDS:
            # \b anchors restrict the match to whole words
            if re.search(r'\b' + re.escape(word) + r'\b', text_lower):
                return word
        return None

    @staticmethod
    def _count_total_tokens(messages):
        total_tokens = 0
        for message in messages:
            total_tokens += count_tokens(message["content"])["n_tokens"]
        return total_tokens

    @staticmethod
    def _check_fields(obj, fields):
        return all(field in obj for field in fields)
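
The whole-word matching in `_get_found_blacklisted_words` is easiest to see on a small input. The sketch below reuses the same regular expression under a hypothetical function name, with a shortened two-word list standing in for `BLACKLISTED_WORDS`.

```python
import re
from typing import Iterable, Optional

# Hypothetical, shortened list used only for this sketch.
SAMPLE_BLACKLIST = ["politics", "casino"]


def find_blacklisted_word(text: str, words: Iterable[str]) -> Optional[str]:
    """Return the first listed word that occurs as a whole word, else None."""
    text_lower = text.lower()
    for word in words:
        # \b on both sides means "geopolitics" does not count as "politics".
        if re.search(r'\b' + re.escape(word) + r'\b', text_lower):
            return word
    return None


hit = find_blacklisted_word("A night at the Casino.", SAMPLE_BLACKLIST)
miss = find_blacklisted_word("An essay on geopolitics.", SAMPLE_BLACKLIST)
```

Lower-casing the text makes the check case-insensitive, and `re.escape` keeps list entries from being interpreted as regex syntax.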

View File

@@ -0,0 +1,22 @@
import os

from fastapi.concurrency import run_in_threadpool
from whisper import Whisper

from app.services.abc import ISpeechToTextService


class OpenAIWhisper(ISpeechToTextService):
    def __init__(self, model: Whisper):
        self._model = model

    async def speech_to_text(self, file_path):
        if os.path.exists(file_path):
            result = await run_in_threadpool(
                self._model.transcribe, file_path, fp16=False, language='English', verbose=False
            )
            return result["text"]
        else:
            print("File not found:", file_path)
            raise Exception("File " + file_path + " not found.")

View File

@@ -0,0 +1,68 @@
import re
from functools import reduce

from app.configs.constants import TemperatureSettings, GPTModels
from app.helpers import count_tokens
from app.services.abc import ILLMService, ITrainingService


class TrainingService(ITrainingService):
    def __init__(self, llm: ILLMService):
        self._llm = llm

    async def fetch_tips(self, context: str, question: str, answer: str, correct_answer: str):
        messages = self._get_question_tips(question, answer, correct_answer, context)

        # Sum the token counts of every message that has a "content" field
        token_count = reduce(lambda count, item: count + count_tokens(item)['n_tokens'],
                             map(lambda x: x["content"], filter(lambda x: "content" in x, messages)), 0)

        response = await self._llm.prediction(
            GPTModels.GPT_3_5_TURBO,
            messages,
            None,
            TemperatureSettings.TIPS_TEMPERATURE,
            token_count=token_count
        )

        if isinstance(response, str):
            # Strip a leading "label:" prefix from plain-text responses
            response = re.sub(r"^[a-zA-Z0-9_]+\:\s*", "", response)

        return response

    @staticmethod
    def _get_question_tips(question: str, answer: str, correct_answer: str, context: str = None):
        messages = [
            {
                "role": "user",
                "content": (
                    "You are an IELTS exam program that analyzes incorrect answers to questions and gives tips to "
                    "help students understand why it was a wrong answer and gives helpful insight for the future. "
                    "The tip should refer to the context and question."
                ),
            }
        ]

        if not (context is None or context == ""):
            messages.append({
                "role": "user",
                "content": f"This is the context for the question: {context}",
            })

        messages.extend([
            {
                "role": "user",
                "content": f"This is the question: {question}",
            },
            {
                "role": "user",
                "content": f"This is the answer: {answer}",
            },
            {
                "role": "user",
                "content": f"This is the correct answer: {correct_answer}",
            }
        ])

        return messages
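
The `re.sub` call in `fetch_tips` removes a single leading label such as `Tip:` from a plain-text response. A small sketch of what that pattern does and does not strip follows; the function name and the sample strings are hypothetical.

```python
import re


def strip_label_prefix(response: str) -> str:
    """Drop one leading 'label:' made of letters, digits, or underscores."""
    return re.sub(r"^[a-zA-Z0-9_]+\:\s*", "", response)


with_label = strip_label_prefix("Tip: Re-read the second paragraph.")
without_label = strip_label_prefix("Re-read the second paragraph.")
two_word_label = strip_label_prefix("Helpful tip: Re-read the second paragraph.")
```

Because the character class excludes spaces, a multi-word label like `Helpful tip:` is left untouched.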

View File

@@ -0,0 +1,147 @@
from app.services.abc import IWritingService, ILLMService, IAIDetectorService
from app.configs.constants import GPTModels, TemperatureSettings
from app.helpers import TextHelper, ExercisesHelper


class WritingService(IWritingService):
    def __init__(self, llm: ILLMService, ai_detector: IAIDetectorService):
        self._llm = llm
        self._ai_detector = ai_detector

    async def get_writing_task_general_question(self, task: int, topic: str, difficulty: str):
        messages = [
            {
                "role": "system",
                "content": (
                    'You are a helpful assistant designed to output JSON in this format: {"prompt": "prompt content"}'
                )
            },
            {
                "role": "user",
                "content": self._get_writing_prompt(task, topic, difficulty)
            }
        ]

        llm_model = GPTModels.GPT_3_5_TURBO if task == 1 else GPTModels.GPT_4_O

        response = await self._llm.prediction(
            llm_model,
            messages,
            ["prompt"],
            TemperatureSettings.GEN_QUESTION_TEMPERATURE
        )

        return {
            "question": response["prompt"].strip(),
            "difficulty": difficulty,
            "topic": topic
        }

    @staticmethod
    def _get_writing_prompt(task: int, topic: str, difficulty: str):
        return (
            'Craft a prompt for an IELTS Writing Task 1 General Training exercise that instructs the '
            'student to compose a letter. The prompt should present a specific scenario or situation, '
            f'based on the topic of "{topic}", requiring the student to provide information, '
            'advice, or instructions within the letter. Make sure that the generated prompt is '
            f'of {difficulty} difficulty and does not contain forbidden subjects in Muslim countries.'
        ) if task == 1 else (
            f'Craft a comprehensive question of {difficulty} difficulty like the ones for IELTS '
            'Writing Task 2 General Training that directs the candidate to delve into an in-depth '
            f'analysis of contrasting perspectives on the topic of "{topic}".'
        )

    async def grade_writing_task(self, task: int, question: str, answer: str):
        # bare_minimum: words required to be graded at all; minimum: length of the exemplary answer
        bare_minimum = 100 if task == 1 else 180
        minimum = 150 if task == 1 else 250

        # TODO: left as is, don't know if this is intended or not
        llm_model = GPTModels.GPT_3_5_TURBO if task == 1 else GPTModels.GPT_4_O
        temperature = (
            TemperatureSettings.GRADING_TEMPERATURE
            if task == 1 else
            TemperatureSettings.GEN_QUESTION_TEMPERATURE
        )

        if not TextHelper.has_words(answer):
            return self._zero_rating("The answer does not contain enough English words.")
        elif not TextHelper.has_x_words(answer, bare_minimum):
            return self._zero_rating("The answer is insufficient and too small to be graded.")
        else:
            messages = [
                {
                    "role": "system",
                    "content": (
                        'You are a helpful assistant designed to output JSON in this format: '
                        '{"perfect_answer": "example perfect answer", "comment": '
                        '"comment about answer quality", "overall": 0.0, "task_response": '
                        '{"Task Achievement": 0.0, "Coherence and Cohesion": 0.0, '
                        '"Lexical Resource": 0.0, "Grammatical Range and Accuracy": 0.0}}'
                    )
                },
                {
                    "role": "user",
                    "content": (
                        f'Evaluate the given Writing Task {task} response based on the IELTS grading system, '
                        'ensuring a strict assessment that penalizes errors. Deduct points for deviations '
                        'from the task, and assign a score of 0 if the response fails to address the question. '
                        f'Additionally, provide an exemplary answer with a minimum of {minimum} words, along with a '
                        'detailed commentary highlighting both strengths and weaknesses in the response. '
                        f'\n Question: "{question}" \n Answer: "{answer}"')
                },
                {
                    "role": "user",
                    "content": f'The perfect answer must have at least {minimum} words.'
                }
            ]

            response = await self._llm.prediction(
                llm_model,
                messages,
                ["comment"],
                temperature
            )

            response["overall"] = ExercisesHelper.fix_writing_overall(response["overall"], response["task_response"])
            response['fixed_text'] = await self._get_fixed_text(answer)

            # The AI detection result is attached only when the detector returned one
            ai_detection = await self._ai_detector.run_detection(answer)
            if ai_detection is not None:
                response['ai_detection'] = ai_detection

            return response

    async def _get_fixed_text(self, text):
        messages = [
            {"role": "system", "content": ('You are a helpful assistant designed to output JSON in this format: '
                                           '{"fixed_text": "fixed text with no misspelling errors"}')
             },
            {"role": "user", "content": (
                'Fix the errors in the given text and put it in a JSON. '
                f'Do not complete the answer, only replace what is wrong. \n The text: "{text}"')
             }
        ]

        # Low temperature (0.2) and no blacklist check for the correction pass
        response = await self._llm.prediction(
            GPTModels.GPT_3_5_TURBO,
            messages,
            ["fixed_text"],
            0.2,
            False
        )
        return response["fixed_text"]

    @staticmethod
    def _zero_rating(comment: str):
        return {
            'comment': comment,
            'overall': 0,
            'task_response': {
                'Coherence and Cohesion': 0,
                'Grammatical Range and Accuracy': 0,
                'Lexical Resource': 0,
                'Task Achievement': 0
            }
        }
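
The word-count gate at the top of `grade_writing_task` can be sketched on its own. The thresholds (100/150 for task 1, 180/250 otherwise) come from the method above; `has_x_words` below is a hypothetical stand-in for `TextHelper.has_x_words`, whose real implementation is not shown in this diff, and it simply counts whitespace-separated tokens.

```python
from typing import Optional, Tuple


def word_thresholds(task: int) -> Tuple[int, int]:
    """Return (bare_minimum, minimum) as used in grade_writing_task."""
    return (100, 150) if task == 1 else (180, 250)


def has_x_words(text: str, x: int) -> bool:
    # Hypothetical stand-in for TextHelper.has_x_words: counts whitespace-separated tokens.
    return len(text.split()) >= x


def rejection_reason(task: int, answer: str) -> Optional[str]:
    """Return the zero-rating comment for too-short answers, else None."""
    bare_minimum, _ = word_thresholds(task)
    if not has_x_words(answer, bare_minimum):
        return "The answer is insufficient and too small to be graded."
    return None


short_answer = "word " * 99
long_answer = "word " * 100
```

With these assumptions a 100-word answer passes the gate for task 1 but is still rejected for task 2, where the bare minimum is 180 words.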

View File

@@ -6,5 +6,5 @@ services:
     build: .
     image: ecrop/ielts-be:latest
     ports:
-      - 8080:5000
+      - 8080:8000
     restart: unless-stopped

View File

@@ -1,441 +0,0 @@
from enum import Enum
from typing import List
class QuestionType(Enum):
LISTENING_SECTION_1 = "Listening Section 1"
LISTENING_SECTION_2 = "Listening Section 2"
LISTENING_SECTION_3 = "Listening Section 3"
LISTENING_SECTION_4 = "Listening Section 4"
WRITING_TASK_1 = "Writing Task 1"
WRITING_TASK_2 = "Writing Task 2"
SPEAKING_1 = "Speaking Task Part 1"
SPEAKING_2 = "Speaking Task Part 2"
READING_PASSAGE_1 = "Reading Passage 1"
READING_PASSAGE_2 = "Reading Passage 2"
READING_PASSAGE_3 = "Reading Passage 3"
def get_grading_messages(question_type: QuestionType, question: str, answer: str, context: str = None):
if QuestionType.WRITING_TASK_1 == question_type:
messages = [
{
"role": "user",
"content": "You are a IELTS examiner.",
},
{
"role": "user",
"content": f"The question you have to grade is of type Writing Task 1 and is the following: {question}",
}
]
if not (context is None or context == ""):
messages.append({
"role": "user",
"content": f"To grade the previous question, bear in mind the following context: {context}",
})
messages.extend([
{
"role": "user",
"content": "It is mandatory for you to provide your response with the overall grade and breakdown grades, "
"with just the following json format: {'comment': 'comment about answer quality', 'overall': 7.0, "
"'task_response': {'Task Achievement': 8.0, 'Coherence and Cohesion': 6.5, 'Lexical Resource': 7.5, "
"'Grammatical Range and Accuracy': 6.0}}",
},
{
"role": "user",
"content": "Example output: { 'comment': 'Overall, the response is good but there are some areas that need "
"improvement.\n\nIn terms of Task Achievement, the writer has addressed all parts of the question "
"and has provided a clear opinion on the topic. However, some of the points made are not fully "
"developed or supported with examples.\n\nIn terms of Coherence and Cohesion, there is a clear "
"structure to the response with an introduction, body paragraphs and conclusion. However, there "
"are some issues with cohesion as some sentences do not flow smoothly from one to another.\n\nIn "
"terms of Lexical Resource, there is a good range of vocabulary used throughout the response and "
"some less common words have been used effectively.\n\nIn terms of Grammatical Range and Accuracy, "
"there are some errors in grammar and sentence structure which affect clarity in places.\n\nOverall, "
"this response would score a band 6.5.', 'overall': 6.5, 'task_response': "
"{ 'Coherence and Cohesion': 6.5, 'Grammatical Range and Accuracy': 6.0, 'Lexical Resource': 7.0, "
"'Task Achievement': 7.0}}",
},
{
"role": "user",
"content": f"Evaluate this answer according to ielts grading system: {answer}",
},
])
return messages
elif QuestionType.WRITING_TASK_2 == question_type:
return [
{
"role": "user",
"content": "You are a IELTS examiner.",
},
{
"role": "user",
"content": f"The question you have to grade is of type Writing Task 2 and is the following: {question}",
},
{
"role": "user",
"content": "It is mandatory for you to provide your response with the overall grade and breakdown grades, "
"with just the following json format: {'comment': 'comment about answer quality', 'overall': 7.0, "
"'task_response': {'Task Achievement': 8.0, 'Coherence and Cohesion': 6.5, 'Lexical Resource': 7.5, "
"'Grammatical Range and Accuracy': 6.0}}",
},
{
"role": "user",
"content": "Example output: { 'comment': 'Overall, the response is good but there are some areas that need "
"improvement.\n\nIn terms of Task Achievement, the writer has addressed all parts of the question "
"and has provided a clear opinion on the topic. However, some of the points made are not fully "
"developed or supported with examples.\n\nIn terms of Coherence and Cohesion, there is a clear "
"structure to the response with an introduction, body paragraphs and conclusion. However, there "
"are some issues with cohesion as some sentences do not flow smoothly from one to another.\n\nIn "
"terms of Lexical Resource, there is a good range of vocabulary used throughout the response and "
"some less common words have been used effectively.\n\nIn terms of Grammatical Range and Accuracy, "
"there are some errors in grammar and sentence structure which affect clarity in places.\n\nOverall, "
"this response would score a band 6.5.', 'overall': 6.5, 'task_response': "
"{ 'Coherence and Cohesion': 6.5, 'Grammatical Range and Accuracy': 6.0, 'Lexical Resource': 7.0, "
"'Task Achievement': 7.0}}",
},
{
"role": "user",
"content": f"Evaluate this answer according to ielts grading system: {answer}",
},
]
elif QuestionType.SPEAKING_1 == question_type:
return [
{
"role": "user",
"content": "You are an IELTS examiner."
},
{
"role": "user",
"content": f"The question you need to grade is a Speaking Task Part 1 question, and it is as follows: {question}"
},
{
"role": "user",
"content": "Please provide your assessment using the following JSON format: {'comment': 'Comment about answer "
"quality will go here', 'overall': 7.0, 'task_response': {'Fluency and "
"Coherence': 8.0, 'Lexical Resource': 6.5, 'Grammatical Range and Accuracy': 7.5, 'Pronunciation': 6.0}}"
},
{
"role": "user",
"content": "Example output: {'comment': 'Comment about answer quality will go here', 'overall': 6.5, "
"'task_response': {'Fluency and Coherence': 7.0, "
"'Lexical Resource': 6.5, 'Grammatical Range and Accuracy': 7.0, 'Pronunciation': 6.0}}"
},
{
"role": "user",
"content": "Please assign a grade of 0 if the answer provided does not address the question."
},
{
"role": "user",
"content": f"Assess this answer according to the IELTS grading system: {answer}"
},
{
"role": "user",
"content": "Remember to consider Fluency and Coherence, Lexical Resource, Grammatical Range and Accuracy, "
"and Pronunciation when grading the response."
}
]
elif QuestionType.SPEAKING_2 == question_type:
return [
{
"role": "user",
"content": "You are an IELTS examiner."
},
{
"role": "user",
"content": f"The question you need to grade is a Speaking Task Part 2 question, and it is as follows: {question}"
},
{
"role": "user",
"content": "Please provide your assessment using the following JSON format: {\"comment\": \"Comment about "
"answer quality\", \"overall\": 7.0, \"task_response\": {\"Fluency and Coherence\": 8.0, \"Lexical "
"Resource\": 6.5, \"Grammatical Range and Accuracy\": 7.5, \"Pronunciation\": 6.0}}"
},
{
"role": "user",
"content": "Example output: {\"comment\": \"The candidate has provided a clear response to the question "
"and has given examples of how they spend their weekends. However, there are some issues with "
"grammar and pronunciation that affect the overall score. In terms of fluency and coherence, "
"the candidate speaks clearly and smoothly with only minor hesitations. They have also provided "
"a well-organized response that is easy to follow. Regarding lexical resource, the candidate "
"has used a range of vocabulary related to weekend activities but there are some errors in "
"word choice that affect the meaning of their sentences. In terms of grammatical range and "
"accuracy, the candidate has used a mix of simple and complex sentence structures but there "
"are some errors in subject-verb agreement and preposition use. Finally, regarding pronunciation, "
"the candidate's speech is generally clear but there are some issues with stress and intonation "
"that make it difficult to understand at times.\", \"overall\": 6.5, \"task_response\": {\"Fluency "
"and Coherence\": 7.0, \"Lexical Resource\": 6.5, \"Grammatical Range and Accuracy\": 7.0, "
"\"Pronunciation\": 6.0}}"
},
{
"role": "user",
"content": "Please assign a grade of 0 if the answer provided does not address the question."
},
{
"role": "user",
"content": f"Assess this answer according to the IELTS grading system: {answer}"
},
{
"role": "user",
"content": "Remember to consider Fluency and Coherence, Lexical Resource, Grammatical Range and Accuracy, "
"and Pronunciation when grading the response."
}
]
else:
raise Exception("Question type not implemented: " + question_type.value)
def get_speaking_grading_messages(answers: List):
messages = [
{
"role": "user",
"content": "You are an IELTS examiner."
},
{
"role": "user",
"content": "The exercise you need to grade is a Speaking Task, and it is has the following questions and answers:"
}
]
for item in answers:
question = item["question"]
answer = item["answer_text"]
messages.append({
"role": "user",
"content": f"Question: {question}; Answer: {answer}"
})
messages.extend([
{
"role": "user",
"content": f"Assess this answer according to the IELTS grading system."
},
{
"role": "user",
"content": "Please provide your assessment using the following JSON format: {'comment': 'Comment about answer "
"quality will go here', 'overall': 7.0, 'task_response': {'Fluency and "
"Coherence': 8.0, 'Lexical Resource': 6.5, 'Grammatical Range and Accuracy': 7.5, 'Pronunciation': 6.0}}"
},
{
"role": "user",
"content": "Example output: {'comment': 'Comment about answer quality will go here', 'overall': 6.5, "
"'task_response': {'Fluency and Coherence': 7.0, "
"'Lexical Resource': 6.5, 'Grammatical Range and Accuracy': 7.0, 'Pronunciation': 6.0}}"
},
{
"role": "user",
"content": "Please assign a grade of 0 if the answer provided does not address the question."
},
{
"role": "user",
"content": "Remember to consider Fluency and Coherence, Lexical Resource, Grammatical Range and Accuracy, "
"and Pronunciation when grading the response."
}
])
return messages
def get_question_gen_messages(question_type: QuestionType):
if QuestionType.LISTENING_SECTION_1 == question_type:
return [
{
"role": "user",
"content": "You are a IELTS program that generates questions for the exams.",
},
{
"role": "user",
"content": "Provide me with a transcript similar to the ones in ielts exam Listening Section 1. "
"Create an engaging transcript simulating a conversation related to a unique type of service "
"that requires getting the customer's details. Make sure to include specific details "
"and descriptions to bring"
"the scenario to life. After the transcript, please "
"generate a 'form like' fill in the blanks exercise with 6 form fields (ex: name, date of birth)"
" to fill related to the customer's details. Finally, "
"provide the answers for the exercise. The response must be a json following this format: "
"{ 'type': '<type of registration (ex: hotel, gym, english course, etc)>', "
"'transcript': '<transcript of just the conversation about a registration of some sort, "
"identify the person talking in each speech line>', "
"'exercise': { 'form field': { '1': '<form field 1>', '2': '<form field 2>', "
"'3': '<form field 3>', '4': '<form field 4>', "
"'5': '<form field 5>', '6': '<form field 5>' }, "
"'answers': {'1': '<answer to fill blank space in form field 1>', '2': '<answer to fill blank "
"space in form field 2>', '3': '<answer to fill blank space in form field 3>', "
"'4': '<answer to fill blank space in form field 4>', '5': '<answer to fill blank space in form field 5>',"
" '6': '<answer to fill blank space in form field 6>'}}}",
}
]
elif QuestionType.LISTENING_SECTION_2 == question_type:
return [
{
"role": "user",
"content": "You are a IELTS program that generates questions for the exams.",
},
{
"role": "user",
"content": "Provide me with a transcript similar to the ones in ielts exam Listening section 2. After the transcript, please "
"generate a fill in the blanks exercise with 6 statements related to the text content. Finally, "
"provide the answers for the exercise. The response must be a json following this format: "
"{ 'transcript': 'transcript about some subject', 'exercise': { 'statements': { '1': 'statement 1 "
"with a blank space to fill', '2': 'statement 2 with a blank space to fill', '3': 'statement 3 with a "
"blank space to fill', '4': 'statement 4 with a blank space to fill', '5': 'statement 5 with a blank "
"space to fill', '6': 'statement 6 with a blank space to fill' }, "
"'answers': {'1': 'answer to fill blank space in statement 1', '2': 'answer to fill blank "
"space in statement 2', '3': 'answer to fill blank space in statement 3', "
"'4': 'answer to fill blank space in statement 4', '5': 'answer to fill blank space in statement 5',"
" '6': 'answer to fill blank space in statement 6'}}}",
}
]
elif QuestionType.LISTENING_SECTION_3 == question_type:
return [
{
"role": "user",
"content": "You are a IELTS program that generates questions for the exams.",
},
{
"role": "user",
"content": "Provide me with a transcript similar to the ones in ielts exam Listening section 3. After the transcript, please "
"generate 4 multiple choice questions related to the text content. Finally, "
"provide the answers for the exercise. The response must be a json following this format: "
"{ 'transcript': 'generated transcript similar to the ones in ielts exam Listening section 3', "
"'exercise': { 'questions': [ { 'question': "
"'question 1', 'options': ['option 1', 'option 2', 'option 3', 'option 4'], 'answer': 1}, "
"{'question': 'question 2', 'options': ['option 1', 'option 2', 'option 3', 'option 4'], "
"'answer': 3}, {'question': 'question 3', 'options': ['option 1', 'option 2', 'option 3', "
"'option 4'], 'answer': 0}, {'question': 'question 4', 'options': ['option 1', 'option 2', "
"'option 3', 'option 4'], 'answer': 2}]}}",
}
]
elif QuestionType.LISTENING_SECTION_4 == question_type:
return [
{
"role": "user",
"content": "You are a IELTS program that generates questions for the exams.",
},
{
"role": "user",
"content": "Provide me with a transcript similar to the ones in ielts exam Listening section 4. After the transcript, please "
"generate 4 completion-type questions related to the text content to complete with 1 word. Finally, "
"provide the answers for the exercise. The response must be a json following this format: "
"{ 'transcript': 'generated transcript similar to the ones in ielts exam Listening section 4', "
"'exercise': [ { 'question': 'question 1', 'answer': 'answer 1'}, "
"{'question': 'question 2', 'answer': 'answer 2'}, {'question': 'question 3', 'answer': 'answer 3'}, "
"{'question': 'question 4', 'answer': 'answer 4'}]}",
}
]
elif QuestionType.WRITING_TASK_2 == question_type:
return [
{
"role": "user",
"content": "You are a IELTS program that generates questions for the exams.",
},
{
"role": "user",
"content": "The question you have to generate is of type Writing Task 2.",
},
{
"role": "user",
"content": "It is mandatory for you to provide your response with the question "
"just with the following json format: {'question': 'question'}",
},
{
"role": "user",
"content": "Example output: { 'question': 'We are becoming increasingly dependent on computers. "
"They are used in businesses, hospitals, crime detection and even to fly planes. What things will "
"they be used for in the future? Is this dependence on computers a good thing or should we he more "
"auspicious of their benefits?'}",
},
{
"role": "user",
"content": "Generate a question for IELTS exam Writing Task 2.",
},
]
elif QuestionType.SPEAKING_1 == question_type:
return [
{
"role": "user",
"content": "You are a IELTS program that generates questions for the exams.",
},
{
"role": "user",
"content": "The question you have to generate is of type Speaking Task Part 1.",
},
{
"role": "user",
"content": "It is mandatory for you to provide your response with the question "
"just with the following json format: {'question': 'question'}",
},
{
"role": "user",
"content": "Example output: { 'question': 'Lets talk about your home town or village. "
"What kind of place is it? Whats the most interesting part of your town/village? "
"What kind of jobs do the people in your town/village do? "
"Would you say its a good place to live? (Why?)'}",
},
{
"role": "user",
"content": "Generate a question for IELTS exam Speaking Task.",
},
]
elif QuestionType.SPEAKING_2 == question_type:
return [
{
"role": "user",
"content": "You are a IELTS program that generates questions for the exams.",
},
{
"role": "user",
"content": "The question you have to generate is of type Speaking Task Part 2.",
},
{
"role": "user",
"content": "It is mandatory for you to provide your response with the question "
"just with the following json format: {'question': 'question'}",
},
{
"role": "user",
"content": "Example output: { 'question': 'Describe something you own which is very important to you. "
"You should say: where you got it from how long you have had it what you use it for and "
"explain why it is important to you.'}",
},
{
"role": "user",
"content": "Generate a question for IELTS exam Speaking Task.",
},
]
else:
raise Exception("Question type not implemented: " + question_type.value)
def get_question_tips(question: str, answer: str, correct_answer: str, context: str = None):
messages = [
{
"role": "user",
"content": "You are a IELTS exam program that analyzes incorrect answers to questions and gives tips to "
"help students understand why it was a wrong answer and gives helpful insight for the future. "
"The tip should refer to the context and question.",
}
]
if not (context is None or context == ""):
messages.append({
"role": "user",
"content": f"This is the context for the question: {context}",
})
messages.extend([
{
"role": "user",
"content": f"This is the question: {question}",
},
{
"role": "user",
"content": f"This is the answer: {answer}",
},
{
"role": "user",
"content": f"This is the correct answer: {correct_answer}",
}
])
return messages

View File

@@ -1,656 +0,0 @@
AUDIO_FILES_PATH = 'download-audio/'
FIREBASE_LISTENING_AUDIO_FILES_PATH = 'listening_recordings/'
VIDEO_FILES_PATH = 'download-video/'
FIREBASE_SPEAKING_VIDEO_FILES_PATH = 'speaking_videos/'
GRADING_TEMPERATURE = 0.1
TIPS_TEMPERATURE = 0.2
GEN_QUESTION_TEMPERATURE = 0.7
GPT_3_5_TURBO = "gpt-3.5-turbo"
GPT_4_TURBO = "gpt-4-turbo"
GPT_4_O = "gpt-4o"
GPT_3_5_TURBO_16K = "gpt-3.5-turbo-16k"
GPT_3_5_TURBO_INSTRUCT = "gpt-3.5-turbo-instruct"
GPT_4_PREVIEW = "gpt-4-turbo-preview"
GRADING_FIELDS = ['comment', 'overall', 'task_response']
GEN_FIELDS = ['topic']
GEN_TEXT_FIELDS = ['title']
LISTENING_GEN_FIELDS = ['transcript', 'exercise']
READING_EXERCISE_TYPES = ['fillBlanks', 'writeBlanks', 'trueFalse', 'paragraphMatch']
LISTENING_EXERCISE_TYPES = ['multipleChoice', 'writeBlanksQuestions', 'writeBlanksFill', 'writeBlanksForm']
TOTAL_READING_PASSAGE_1_EXERCISES = 13
TOTAL_READING_PASSAGE_2_EXERCISES = 13
TOTAL_READING_PASSAGE_3_EXERCISES = 14
TOTAL_LISTENING_SECTION_1_EXERCISES = 10
TOTAL_LISTENING_SECTION_2_EXERCISES = 10
TOTAL_LISTENING_SECTION_3_EXERCISES = 10
TOTAL_LISTENING_SECTION_4_EXERCISES = 10
LISTENING_MIN_TIMER_DEFAULT = 30
WRITING_MIN_TIMER_DEFAULT = 60
SPEAKING_MIN_TIMER_DEFAULT = 14
BLACKLISTED_WORDS = ["jesus", "sex", "gay", "lesbian", "homosexual", "god", "angel", "pornography", "beer", "wine",
"cocaine", "alcohol", "nudity", "lgbt", "casino", "gambling", "catholicism",
"discrimination", "politics", "politic", "christianity", "islam", "christian", "christians",
"jews", "jew", "discrimination", "discriminatory"]
EN_US_VOICES = [
{'Gender': 'Female', 'Id': 'Salli', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Salli',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Male', 'Id': 'Matthew', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Matthew',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Female', 'Id': 'Kimberly', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Kimberly',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Female', 'Id': 'Kendra', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Kendra',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Male', 'Id': 'Justin', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Justin',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Male', 'Id': 'Joey', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Joey',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Female', 'Id': 'Joanna', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Joanna',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Female', 'Id': 'Ivy', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Ivy',
'SupportedEngines': ['neural', 'standard']}]
EN_GB_VOICES = [
{'Gender': 'Female', 'Id': 'Emma', 'LanguageCode': 'en-GB', 'LanguageName': 'British English', 'Name': 'Emma',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Male', 'Id': 'Brian', 'LanguageCode': 'en-GB', 'LanguageName': 'British English', 'Name': 'Brian',
'SupportedEngines': ['neural', 'standard']},
{'Gender': 'Female', 'Id': 'Amy', 'LanguageCode': 'en-GB', 'LanguageName': 'British English', 'Name': 'Amy',
'SupportedEngines': ['neural', 'standard']}]
EN_GB_WLS_VOICES = [
{'Gender': 'Male', 'Id': 'Geraint', 'LanguageCode': 'en-GB-WLS', 'LanguageName': 'Welsh English', 'Name': 'Geraint',
'SupportedEngines': ['standard']}]
EN_AU_VOICES = [{'Gender': 'Male', 'Id': 'Russell', 'LanguageCode': 'en-AU', 'LanguageName': 'Australian English',
'Name': 'Russell', 'SupportedEngines': ['standard']},
{'Gender': 'Female', 'Id': 'Nicole', 'LanguageCode': 'en-AU', 'LanguageName': 'Australian English',
'Name': 'Nicole', 'SupportedEngines': ['standard']}]
ALL_VOICES = EN_US_VOICES + EN_GB_VOICES + EN_GB_WLS_VOICES + EN_AU_VOICES
NEURAL_EN_US_VOICES = [
{'Gender': 'Female', 'Id': 'Danielle', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Danielle',
'SupportedEngines': ['neural']},
{'Gender': 'Male', 'Id': 'Gregory', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Gregory',
'SupportedEngines': ['neural']},
{'Gender': 'Male', 'Id': 'Kevin', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Kevin',
'SupportedEngines': ['neural']},
{'Gender': 'Female', 'Id': 'Ruth', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Ruth',
'SupportedEngines': ['neural']},
{'Gender': 'Male', 'Id': 'Stephen', 'LanguageCode': 'en-US', 'LanguageName': 'US English', 'Name': 'Stephen',
'SupportedEngines': ['neural']}]
NEURAL_EN_GB_VOICES = [
{'Gender': 'Male', 'Id': 'Arthur', 'LanguageCode': 'en-GB', 'LanguageName': 'British English', 'Name': 'Arthur',
'SupportedEngines': ['neural']}]
NEURAL_EN_AU_VOICES = [
{'Gender': 'Female', 'Id': 'Olivia', 'LanguageCode': 'en-AU', 'LanguageName': 'Australian English',
'Name': 'Olivia', 'SupportedEngines': ['neural']}]
NEURAL_EN_ZA_VOICES = [
{'Gender': 'Female', 'Id': 'Ayanda', 'LanguageCode': 'en-ZA', 'LanguageName': 'South African English',
'Name': 'Ayanda', 'SupportedEngines': ['neural']}]
NEURAL_EN_NZ_VOICES = [
{'Gender': 'Female', 'Id': 'Aria', 'LanguageCode': 'en-NZ', 'LanguageName': 'New Zealand English', 'Name': 'Aria',
'SupportedEngines': ['neural']}]
NEURAL_EN_IN_VOICES = [
{'Gender': 'Female', 'Id': 'Kajal', 'LanguageCode': 'en-IN', 'LanguageName': 'Indian English', 'Name': 'Kajal',
'SupportedEngines': ['neural']}]
NEURAL_EN_IE_VOICES = [
{'Gender': 'Female', 'Id': 'Niamh', 'LanguageCode': 'en-IE', 'LanguageName': 'Irish English', 'Name': 'Niamh',
'SupportedEngines': ['neural']}]
ALL_NEURAL_VOICES = NEURAL_EN_US_VOICES + NEURAL_EN_GB_VOICES + NEURAL_EN_AU_VOICES + NEURAL_EN_ZA_VOICES + NEURAL_EN_NZ_VOICES + NEURAL_EN_IE_VOICES
MALE_VOICES = [item for item in ALL_VOICES if item.get('Gender') == 'Male']
FEMALE_VOICES = [item for item in ALL_VOICES if item.get('Gender') == 'Female']
MALE_NEURAL_VOICES = [item for item in ALL_NEURAL_VOICES if item.get('Gender') == 'Male']
FEMALE_NEURAL_VOICES = [item for item in ALL_NEURAL_VOICES if item.get('Gender') == 'Female']
difficulties = ["easy", "medium", "hard"]
mti_topics = [
"Education",
"Technology",
"Environment",
"Health and Fitness",
"Engineering",
"Work and Careers",
"Travel and Tourism",
"Culture and Traditions",
"Social Issues",
"Arts and Entertainment",
"Climate Change",
"Social Media",
"Sustainable Development",
"Health Care",
"Immigration",
"Artificial Intelligence",
"Consumerism",
"Online Shopping",
"Energy",
"Oil and Gas",
"Poverty and Inequality",
"Cultural Diversity",
"Democracy and Governance",
"Mental Health",
"Ethics and Morality",
"Population Growth",
"Science and Innovation",
"Poverty Alleviation",
"Cybersecurity and Privacy",
"Human Rights",
"Food and Agriculture",
"Cyberbullying and Online Safety",
"Linguistic Diversity",
"Urbanization",
"Artificial Intelligence in Education",
"Youth Empowerment",
"Disaster Management",
"Mental Health Stigma",
"Internet Censorship",
"Sustainable Fashion",
"Indigenous Rights",
"Water Scarcity",
"Social Entrepreneurship",
"Privacy in the Digital Age",
"Sustainable Transportation",
"Gender Equality",
"Automation and Job Displacement",
"Digital Divide",
"Education Inequality"
]
topics = [
"Art and Creativity",
"History of Ancient Civilizations",
"Environmental Conservation",
"Space Exploration",
"Artificial Intelligence",
"Climate Change",
"World Religions",
"The Human Brain",
"Renewable Energy",
"Cultural Diversity",
"Modern Technology Trends",
"Sustainable Agriculture",
"Natural Disasters",
"Cybersecurity",
"Philosophy of Ethics",
"Robotics",
"Health and Wellness",
"Literature and Classics",
"World Geography",
"Social Media Impact",
"Food Sustainability",
"Economics and Markets",
"Human Evolution",
"Political Systems",
"Mental Health Awareness",
"Quantum Physics",
"Biodiversity",
"Education Reform",
"Animal Rights",
"The Industrial Revolution",
"Future of Work",
"Film and Cinema",
"Genetic Engineering",
"Climate Policy",
"Space Travel",
"Renewable Energy Sources",
"Cultural Heritage Preservation",
"Modern Art Movements",
"Sustainable Transportation",
"The History of Medicine",
"Artificial Neural Networks",
"Climate Adaptation",
"Philosophy of Existence",
"Augmented Reality",
"Yoga and Meditation",
"Literary Genres",
"World Oceans",
"Social Networking",
"Sustainable Fashion",
"Prehistoric Era",
"Democracy and Governance",
"Postcolonial Literature",
"Geopolitics",
"Psychology and Behavior",
"Nanotechnology",
"Endangered Species",
"Education Technology",
"Renaissance Art",
"Renewable Energy Policy",
"Modern Architecture",
"Climate Resilience",
"Artificial Life",
"Fitness and Nutrition",
"Classic Literature Adaptations",
"Ethical Dilemmas",
"Internet of Things (IoT)",
"Meditation Practices",
"Literary Symbolism",
"Marine Conservation",
"Sustainable Tourism",
"Ancient Philosophy",
"Cold War Era",
"Behavioral Economics",
"Space Colonization",
"Clean Energy Initiatives",
"Cultural Exchange",
"Modern Sculpture",
"Climate Mitigation",
"Mindfulness",
"Literary Criticism",
"Wildlife Conservation",
"Renewable Energy Innovations",
"History of Mathematics",
"Human-Computer Interaction",
"Global Health",
"Cultural Appropriation",
"Traditional cuisine and culinary arts",
"Local music and dance traditions",
"History of the region and historical landmarks",
"Traditional crafts and artisanal skills",
"Wildlife and conservation efforts",
"Local sports and athletic competitions",
"Fashion trends and clothing styles",
"Education systems and advancements",
"Healthcare services and medical innovations",
"Family values and social dynamics",
"Travel destinations and tourist attractions",
"Environmental sustainability projects",
"Technological developments and innovations",
"Entrepreneurship and business ventures",
"Youth empowerment initiatives",
"Art exhibitions and cultural events",
"Philanthropy and community development projects"
]
two_people_scenarios = [
"Booking a table at a restaurant",
"Making a doctor's appointment",
"Asking for directions to a tourist attraction",
"Inquiring about public transportation options",
"Discussing weekend plans with a friend",
"Ordering food at a café",
"Renting a bicycle for a day",
"Arranging a meeting with a colleague",
"Talking to a real estate agent about renting an apartment",
"Discussing travel plans for an upcoming vacation",
"Checking the availability of a hotel room",
"Talking to a car rental service",
"Asking for recommendations at a library",
"Inquiring about opening hours at a museum",
"Discussing the weather forecast",
"Shopping for groceries",
"Renting a movie from a video store",
"Booking a flight ticket",
"Discussing a school assignment with a classmate",
"Making a reservation for a spa appointment",
"Talking to a customer service representative about a product issue",
"Discussing household chores with a family member",
"Planning a surprise party for a friend",
"Talking to a coworker about a project deadline",
"Inquiring about a gym membership",
"Discussing the menu options at a fast-food restaurant",
"Talking to a neighbor about a community event",
"Asking for help with computer problems",
"Discussing a recent sports game with a sports enthusiast",
"Talking to a pet store employee about buying a pet",
"Asking for information about a local farmer's market",
"Discussing the details of a home renovation project",
"Talking to a coworker about office supplies",
"Making plans for a family picnic",
"Inquiring about admission requirements at a university",
"Discussing the features of a new smartphone with a salesperson",
"Talking to a mechanic about car repairs",
"Making arrangements for a child's birthday party",
"Discussing a new diet plan with a nutritionist",
"Asking for information about a music concert",
"Talking to a hairdresser about getting a haircut",
"Inquiring about a language course at a language school",
"Discussing plans for a weekend camping trip",
"Talking to a bank teller about opening a new account",
"Ordering a drink at a coffee shop",
"Discussing a new book with a book club member",
"Talking to a librarian about library services",
"Asking for advice on finding a job",
"Discussing plans for a garden makeover with a landscaper",
"Talking to a travel agent about a cruise vacation",
"Inquiring about a fitness class at a gym",
"Ordering flowers for a special occasion",
"Discussing a new exercise routine with a personal trainer",
"Talking to a teacher about a child's progress in school",
"Asking for information about a local art exhibition",
"Discussing a home improvement project with a contractor",
"Talking to a babysitter about childcare arrangements",
"Making arrangements for a car service appointment",
"Inquiring about a photography workshop at a studio",
"Discussing plans for a family reunion with a relative",
"Talking to a tech support representative about computer issues",
"Asking for recommendations on pet grooming services",
"Discussing weekend plans with a significant other",
"Talking to a counselor about personal issues",
"Inquiring about a music lesson with a music teacher",
"Ordering a pizza for delivery",
"Making a reservation for a taxi",
"Discussing a new recipe with a chef",
"Talking to a fitness trainer about weight loss goals",
"Inquiring about a dance class at a dance studio",
"Ordering a meal at a food truck",
"Discussing plans for a weekend getaway with a partner",
"Talking to a florist about wedding flower arrangements",
"Asking for advice on home decorating",
"Discussing plans for a charity fundraiser event",
"Talking to a pet sitter about taking care of pets",
"Making arrangements for a spa day with a friend",
"Asking for recommendations on home improvement stores",
"Discussing weekend plans with a travel enthusiast",
"Talking to a car mechanic about car maintenance",
"Inquiring about a cooking class at a culinary school",
"Ordering a sandwich at a deli",
"Discussing plans for a family holiday party",
"Talking to a personal assistant about organizing tasks",
"Asking for information about a local theater production",
"Discussing a new DIY project with a home improvement expert",
"Talking to a wine expert about wine pairing",
"Making arrangements for a pet adoption",
"Asking for advice on planning a wedding"
]
social_monologue_contexts = [
"A guided tour of a historical museum",
"An introduction to a new city for tourists",
"An orientation session for new university students",
"A safety briefing for airline passengers",
"An explanation of the process of recycling",
"A lecture on the benefits of a healthy diet",
"A talk on the importance of time management",
"A monologue about wildlife conservation",
"An overview of local public transportation options",
"A presentation on the history of cinema",
"An introduction to the art of photography",
"A discussion about the effects of climate change",
"An overview of different types of cuisine",
"A lecture on the principles of financial planning",
"A monologue about sustainable energy sources",
"An explanation of the process of online shopping",
"A guided tour of a botanical garden",
"An introduction to a local wildlife sanctuary",
"A safety briefing for hikers in a national park",
"A talk on the benefits of physical exercise",
"A lecture on the principles of effective communication",
"A monologue about the impact of social media",
"An overview of the history of a famous landmark",
"An introduction to the world of fashion design",
"A discussion about the challenges of global poverty",
"An explanation of the process of organic farming",
"A presentation on the history of space exploration",
"An overview of traditional music from different cultures",
"A lecture on the principles of effective leadership",
"A monologue about the influence of technology",
"A guided tour of a famous archaeological site",
"An introduction to a local wildlife rehabilitation center",
"A safety briefing for visitors to a science museum",
"A talk on the benefits of learning a new language",
"A lecture on the principles of architectural design",
"A monologue about the impact of renewable energy",
"An explanation of the process of online banking",
"A presentation on the history of a famous art movement",
"An overview of traditional clothing from various regions",
"A lecture on the principles of sustainable agriculture",
"A discussion about the challenges of urban development",
"A monologue about the influence of social norms",
"A guided tour of a historical battlefield",
"An introduction to a local animal shelter",
"A safety briefing for participants in a charity run",
"A talk on the benefits of community involvement",
"A lecture on the principles of sustainable tourism",
"A monologue about the impact of alternative medicine",
"An explanation of the process of wildlife tracking",
"A presentation on the history of a famous inventor",
"An overview of traditional dance forms from different cultures",
"A lecture on the principles of ethical business practices",
"A discussion about the challenges of healthcare access",
"A monologue about the influence of cultural traditions",
"A guided tour of a famous lighthouse",
"An introduction to a local astronomy observatory",
"A safety briefing for participants in a team-building event",
"A talk on the benefits of volunteering",
"A lecture on the principles of wildlife protection",
"A monologue about the impact of space exploration",
"An explanation of the process of wildlife photography",
"A presentation on the history of a famous musician",
"An overview of traditional art forms from different cultures",
"A lecture on the principles of effective education",
"A discussion about the challenges of sustainable development",
"A monologue about the influence of cultural diversity",
"A guided tour of a famous national park",
"An introduction to a local marine conservation project",
"A safety briefing for participants in a hot air balloon ride",
"A talk on the benefits of cultural exchange programs",
"A lecture on the principles of wildlife conservation",
"A monologue about the impact of technological advancements",
"An explanation of the process of wildlife rehabilitation",
"A presentation on the history of a famous explorer",
"A lecture on the principles of effective marketing",
"A discussion about the challenges of environmental sustainability",
"A monologue about the influence of social entrepreneurship",
"A guided tour of a famous historical estate",
"An introduction to a local marine life research center",
"A safety briefing for participants in a zip-lining adventure",
"A talk on the benefits of cultural preservation",
"A lecture on the principles of wildlife ecology",
"A monologue about the impact of space technology",
"An explanation of the process of wildlife conservation",
"A presentation on the history of a famous scientist",
"An overview of traditional crafts and artisans from different cultures",
"A lecture on the principles of effective intercultural communication"
]
four_people_scenarios = [
"A university lecture on history",
"A physics class discussing Newton's laws",
"A medical school seminar on anatomy",
"A training session on computer programming",
"A business school lecture on marketing strategies",
"A chemistry lab experiment and discussion",
"A language class practicing conversational skills",
"A workshop on creative writing techniques",
"A high school math lesson on calculus",
"A training program for customer service representatives",
"A lecture on environmental science and sustainability",
"A psychology class exploring human behavior",
"A music theory class analyzing compositions",
"A nursing school simulation for patient care",
"A computer science class on algorithms",
"A workshop on graphic design principles",
"A law school lecture on constitutional law",
"A geology class studying rock formations",
"A vocational training program for electricians",
"A history seminar focusing on ancient civilizations",
"A biology class dissecting specimens",
"A financial literacy course for adults",
"A literature class discussing classic novels",
"A training session for emergency response teams",
"A sociology lecture on social inequality",
"An art class exploring different painting techniques",
"A medical school seminar on diagnosis",
"A programming bootcamp teaching web development",
"An economics class analyzing market trends",
"A chemistry lab experiment on chemical reactions",
"A language class practicing pronunciation",
"A workshop on public speaking skills",
"A high school physics lesson on electromagnetism",
"A training program for IT professionals",
"A lecture on climate change and its effects",
"A psychology class studying cognitive psychology",
"A music class composing original songs",
"A nursing school simulation for patient assessment",
"A computer science class on data structures",
"A workshop on 3D modeling and animation",
"A law school lecture on contract law",
"A geography class examining world maps",
"A vocational training program for plumbers",
"A history seminar discussing revolutions",
"A biology class exploring genetics",
"A financial literacy course for teens",
"A literature class analyzing poetry",
"A training session for public speaking coaches",
"A sociology lecture on cultural diversity",
"An art class creating sculptures",
"A medical school seminar on surgical techniques",
"A programming bootcamp teaching app development",
"An economics class on global trade policies",
"A chemistry lab experiment on chemical bonding",
"A language class discussing idiomatic expressions",
"A workshop on conflict resolution",
"A high school biology lesson on evolution",
"A training program for project managers",
"A lecture on renewable energy sources",
"A psychology class on abnormal psychology",
"A music class rehearsing for a performance",
"A nursing school simulation for emergency response",
"A computer science class on cybersecurity",
"A workshop on digital marketing strategies",
"A law school lecture on intellectual property",
"A geology class analyzing seismic activity",
"A vocational training program for carpenters",
"A history seminar on the Renaissance",
"A chemistry class synthesizing compounds",
"A financial literacy course for seniors",
"A literature class interpreting Shakespearean plays",
"A training session for negotiation skills",
"A sociology lecture on urbanization",
"An art class creating digital art",
"A medical school seminar on patient communication",
"A programming bootcamp teaching mobile app development",
"An economics class on fiscal policy",
"A physics lab experiment on electromagnetism",
"A language class on cultural immersion",
"A workshop on time management",
"A high school chemistry lesson on stoichiometry",
"A training program for HR professionals",
"A lecture on space exploration and astronomy",
"A psychology class on human development",
"A music class practicing for a recital",
"A nursing school simulation for triage",
"A computer science class on web development frameworks",
"A workshop on team-building exercises",
"A law school lecture on criminal law",
"A geography class studying world cultures",
"A vocational training program for HVAC technicians",
"A history seminar on ancient civilizations",
"A biology class examining ecosystems",
"A financial literacy course for entrepreneurs",
"A literature class analyzing modern literature",
"A training session for leadership skills",
"A sociology lecture on gender studies",
"An art class exploring multimedia art",
"A medical school seminar on patient diagnosis",
"A programming bootcamp teaching software architecture"
]
academic_subjects = [
"Astrophysics",
"Microbiology",
"Political Science",
"Environmental Science",
"Literature",
"Biochemistry",
"Sociology",
"Art History",
"Geology",
"Economics",
"Psychology",
"History of Architecture",
"Linguistics",
"Neurobiology",
"Anthropology",
"Quantum Mechanics",
"Urban Planning",
"Philosophy",
"Marine Biology",
"International Relations",
"Medieval History",
"Geophysics",
"Finance",
"Educational Psychology",
"Graphic Design",
"Paleontology",
"Macroeconomics",
"Cognitive Psychology",
"Renaissance Art",
"Archaeology",
"Microeconomics",
"Social Psychology",
"Contemporary Art",
"Meteorology",
"Political Philosophy",
"Space Exploration",
"Cognitive Science",
"Classical Music",
"Oceanography",
"Public Health",
"Gender Studies",
"Baroque Art",
"Volcanology",
"Business Ethics",
"Music Composition",
"Environmental Policy",
"Media Studies",
"Ancient History",
"Seismology",
"Marketing",
"Human Development",
"Modern Art",
"Astronomy",
"International Law",
"Developmental Psychology",
"Film Studies",
"American History",
"Soil Science",
"Entrepreneurship",
"Clinical Psychology",
"Contemporary Dance",
"Space Physics",
"Political Economy",
"Cognitive Neuroscience",
"20th Century Literature",
"Public Administration",
"European History",
"Atmospheric Science",
"Supply Chain Management",
"Social Work",
"Japanese Literature",
"Planetary Science",
"Labor Economics",
"Industrial-Organizational Psychology",
"French Philosophy",
"Biogeochemistry",
"Strategic Management",
"Educational Sociology",
"Postmodern Literature",
"Public Relations",
"Middle Eastern History",
"International Development",
"Human Resources Management",
"Educational Leadership",
"Russian Literature",
"Quantum Chemistry",
"Environmental Economics",
"Environmental Psychology",
"Ancient Philosophy",
"Immunology",
"Comparative Politics",
"Child Development",
"Fashion Design",
"Geological Engineering",
"Macroeconomic Policy",
"Media Psychology",
"Byzantine Art",
"Ecology",
"International Business"
]
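
A minimal sketch of how constants like these might be consumed, assuming a whole-word blacklist check and a gender/engine voice filter in the same style as the `MALE_VOICES` comprehension. The helper names `contains_blacklisted_word` and `pick_voice` are hypothetical and do not appear in this commit; the lists are trimmed stand-ins for the full constants.

```python
import random
import re

# Trimmed stand-ins for the constants defined above (assumed subset, for illustration only).
BLACKLISTED_WORDS = ["wine", "politics", "casino"]
ALL_VOICES = [
    {'Gender': 'Female', 'Id': 'Salli', 'SupportedEngines': ['neural', 'standard']},
    {'Gender': 'Male', 'Id': 'Geraint', 'SupportedEngines': ['standard']},
    {'Gender': 'Male', 'Id': 'Matthew', 'SupportedEngines': ['neural', 'standard']},
]


def contains_blacklisted_word(text, words=BLACKLISTED_WORDS):
    """Hypothetical helper: True if any blacklisted word occurs as a whole word."""
    # \b anchors keep "wine" from matching inside "twine"; matching is case-insensitive.
    pattern = r"\b(?:" + "|".join(re.escape(word) for word in words) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def pick_voice(voices, gender, engine, rng=random):
    """Hypothetical helper: choose a random voice matching gender and engine."""
    candidates = [
        voice for voice in voices
        if voice.get('Gender') == gender and engine in voice.get('SupportedEngines', [])
    ]
    if not candidates:
        raise ValueError(f"no {gender} voice supports the {engine} engine")
    return rng.choice(candidates)


# Example: keep only scenarios that pass the blacklist check.
scenarios = ["Talking to a wine expert about wine pairing", "Ordering a sandwich at a deli"]
safe_scenarios = [s for s in scenarios if not contains_blacklisted_word(s)]
```

Under this kind of check, the "wine expert" entry in `two_people_scenarios` would be filtered out, since "wine" is in `BLACKLISTED_WORDS`. Whether the project matches on whole words or on substrings is not visible in this part of the diff.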


@@ -1,6 +0,0 @@
from enum import Enum


class ExamVariant(Enum):
    FULL = "full"
    PARTIAL = "partial"

Some files were not shown because too many files have changed in this diff