Preparation
• Log in to CometAPI and click "ADD API key" on the API Keys page to get your token key (sk-xxxxx) and base URL (https://api.cometapi.com).
Prerequisites
• Access to a CometAPI account, with an API key generated from the API Keys page.
• Jupyter Notebook or a Python environment for running the examples (optional, but recommended for interactive testing).
Step 1: Install LiteLLM
Install the LiteLLM library using pip. This is a one-time setup.
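For example, from your terminal:

```bash
pip install litellm
```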
Step 2: Set Up Your API Key
You need to paste the key you just got from CometAPI to authenticate requests. Set it as an environment variable (recommended for security) or pass it directly in your code. Here's an example in Python:
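A minimal sketch of both approaches. Note that the environment variable name COMETAPI_KEY is an assumption here; check the LiteLLM provider docs for the exact name your version expects.

```python
import os

# Option A (recommended): set the key as an environment variable.
# The variable name COMETAPI_KEY is assumed; confirm it in the
# LiteLLM provider docs for CometAPI.
os.environ["COMETAPI_KEY"] = "sk-xxxxx"

# Option B: keep the key in a plain variable and pass it explicitly
# to each call (see Method 2 in Step 3).
api_key = "sk-xxxxx"
```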
Note: Using the environment variable is safer, as it avoids hardcoding sensitive information in your scripts.
Step 3: Make a Basic Completion Call
Use LiteLLM's completion function to send messages to a CometAPI model. You can specify models like cometapi/gpt-5 or cometapi/gpt-4o.
Method 1: Use the environment variable for the API key (recommended).
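A minimal sketch, assuming the environment variable from Step 2 is already set:

```python
from litellm import completion

# LiteLLM picks up the CometAPI key from the environment,
# so no key appears in the code itself.
response = completion(
    model="cometapi/gpt-4o",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)

print(response.choices[0].message.content)
```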
Method 2: Pass the API key explicitly.
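A sketch of the explicit variant; here the key is passed via the api_key argument, which takes precedence over the environment:

```python
from litellm import completion

response = completion(
    model="cometapi/gpt-4o",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    api_key="sk-xxxxx",  # the token key from the Preparation step
)

print(response.choices[0].message.content)
```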
The code will print the model's responses, e.g.:
I'm doing well, thank you! How about you?
Hello! I'm doing great, thanks for asking. How can I assist you today?
This sends a simple user message and retrieves the model's completion. You can customize the messages array for more complex conversations (e.g., by adding system prompts or multi-turn chats), as in the sketch below.
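For example, a sketch of a multi-turn conversation with a system prompt (the roles follow the usual OpenAI-style message schema):

```python
from litellm import completion

messages = [
    {"role": "system", "content": "You are a concise, helpful assistant."},
    {"role": "user", "content": "What is LiteLLM?"},
    {"role": "assistant", "content": "LiteLLM is a unified client for many LLM providers."},
    {"role": "user", "content": "And how do I point it at CometAPI?"},
]

response = completion(model="cometapi/gpt-4o", messages=messages)
print(response.choices[0].message.content)
```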
Step 4: Asynchronous and Streaming Calls
For non-blocking or real-time applications, use LiteLLM's acompletion function for asynchronous calls. This is useful with Python's asyncio for handling concurrency. You can also enable streaming to receive responses in chunks (e.g., for live chat interfaces).
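A minimal sketch matching the behavior described in the notes below (it assumes the COMETAPI_KEY environment variable from Step 2 is set):

```python
import asyncio
from litellm import acompletion

async def test_async_streaming():
    print("Testing asynchronous completion with streaming")
    try:
        # With stream=True, acompletion yields the response as chunks.
        response = await acompletion(
            model="cometapi/gpt-4o",
            messages=[{"role": "user", "content": "Hello, how are you?"}],
            stream=True,
        )
        print("Response object:", response)
        async for chunk in response:
            print("Chunk:", chunk)
    except Exception as e:
        # Catch and print errors (invalid key, network issues, etc.) for debugging.
        print("Error:", e)

# In a script; in a Jupyter Notebook, use `await test_async_streaming()` instead.
asyncio.run(test_async_streaming())
```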
• acompletion is the asynchronous version of completion.
• stream=True enables streaming, where the response is yielded in real-time chunks.
• Use asyncio to run the function (e.g., in a Jupyter Notebook with await, or via asyncio.run() in scripts).
• If an error occurs, it's caught and printed for debugging.
You'll see the response object and individual chunks printed, e.g.:
Testing asynchronous completion with streaming
Response object: <async_generator object acompletion at 0x...>
Chunk: {'choices': [{'delta': {'content': 'Hello'}, 'index': 0}]}
Chunk: {'choices': [{'delta': {'content': '!'}, 'index': 0}]}
... (full response streamed in parts)
Additional Tips
Supported Models: CometAPI models follow the format cometapi/<model-name>, e.g., cometapi/gpt-5, cometapi/gpt-4o, cometapi/chatgpt-4o-latest. Check the CometAPI documentation for the latest models.
Error Handling: Always wrap calls in try-except blocks to handle issues like invalid keys or network errors, as shown in the sketch after this list (which also demonstrates the tuning parameters below).
Advanced Features: LiteLLM supports parameters like temperature, max_tokens, and top_p for fine-tuning responses. Add them to the completion or acompletion calls, e.g., completion(..., temperature=0.7).
Security: Never commit your API key to version control. Use environment variables or secret managers.
Troubleshooting: If you encounter issues, ensure your API key is valid and check LiteLLM's logs. For more details, refer to the LiteLLM documentation or the CometAPI docs.
Rate Limits and Costs: Monitor your API usage in the CometAPI console.
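Putting the error-handling and tuning tips together, a minimal sketch (the parameter values are illustrative, not recommendations):

```python
from litellm import completion

try:
    response = completion(
        model="cometapi/gpt-4o",
        messages=[{"role": "user", "content": "Summarize LiteLLM in one sentence."}],
        temperature=0.7,  # sampling randomness
        max_tokens=256,   # cap on response length
        top_p=0.9,        # nucleus-sampling threshold
    )
    print(response.choices[0].message.content)
except Exception as e:
    # Invalid keys, rate limits, and network errors all surface here.
    print("Request failed:", e)
```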