# NovelAI

# API Key

To get your NovelAI API key, follow these steps:

1. Select the gear icon at the top of the left sidebar.
2. Select "Account" under "User Settings".
3. Select "Get Persistent API Token".
4. Select the copy icon to copy your NovelAI API token to the clipboard.

# Models

You should use Kayra.

Clio is a capable model, but it is not as powerful as Kayra, and its speed advantage is insignificant. On NovelAI's Tablet and Scroll tiers, Clio does have a larger context size than Kayra, but giving up Kayra's better coherence and prose quality for that extra context is unlikely to be worth it.

Krake and Euterpe are not recommended; NovelAI itself refers to them as legacy models.

# Settings

The settings files are stored in `SillyTavern\public\NovelAI Settings`. You can also add your own settings files to that folder manually.

# Response Length

The amount of text to generate per message, in tokens. Note that NovelAI limits each response to 150 tokens.

# Context Size

How many tokens of the chat are kept in the context at any given time. The maximum context size you can use depends on the model and your subscription tier:

- Kayra (Tablet): 3072 tokens
- Kayra (Scroll): 6144 tokens
- Kayra (Opus) and Clio (all tiers): 8192 tokens
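
The limits above can be expressed as a small lookup, which may help when validating a configured context size. This is only a sketch: the table mirrors the values listed in this section, and the helper names `max_context_size` and `clamp_context` are hypothetical, not part of SillyTavern or the NovelAI API.

```python
# Maximum context sizes in tokens, copied from the list in this section.
MAX_CONTEXT = {
    ("kayra", "tablet"): 3072,
    ("kayra", "scroll"): 6144,
    ("kayra", "opus"): 8192,
    ("clio", "tablet"): 8192,
    ("clio", "scroll"): 8192,
    ("clio", "opus"): 8192,
}


def max_context_size(model: str, tier: str) -> int:
    """Hypothetical helper: return the largest usable context size."""
    key = (model.lower(), tier.lower())
    if key not in MAX_CONTEXT:
        raise ValueError(f"unknown model/tier combination: {model}/{tier}")
    return MAX_CONTEXT[key]


def clamp_context(requested: int, model: str, tier: str) -> int:
    """Hypothetical helper: keep a requested size within the allowed range."""
    return max(1, min(requested, max_context_size(model, tier)))


if __name__ == "__main__":
    print(max_context_size("Kayra", "Tablet"))
    print(clamp_context(8192, "Kayra", "Scroll"))
```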

# Temperature

- Lower values: the answers are more logical, but less creative.
- Higher values: the answers are more creative, but less logical.
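
To illustrate why this trade-off appears, here is a minimal sketch of how temperature is commonly applied: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The function name and the example logits are made up for illustration, and NovelAI's exact implementation is not described here.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0]  # illustrative scores for three tokens
    # A low temperature concentrates probability on the top token.
    print(softmax_with_temperature(logits, 0.5))
    # A high temperature spreads probability more evenly.
    print(softmax_with_temperature(logits, 2.0))
```

With a low temperature the most likely token dominates, so the output is more predictable; with a high temperature less likely tokens are sampled more often.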

# Repetition penalty

Higher values make the output less repetitive. If the character fixates on something or repeats the same phrase, increasing this parameter will likely fix it. Avoid raising it too far, as very high values may break the output.

# Repetition penalty range

How many of the most recent tokens, counting back from the last generated token, are checked for repetition. A token within this range is treated as repeated if it appears again in the output.
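
The interaction between the penalty and its range can be sketched as follows. This uses a commonly seen formulation (positive logits are divided by the penalty, negative ones are multiplied by it) and is an assumption for illustration; it is not confirmed to be how NovelAI implements it. The function name and the token IDs are hypothetical.

```python
def apply_repetition_penalty(logits, generated_ids, penalty, penalty_range):
    """Penalize tokens that occur within the last `penalty_range` tokens.

    Assumption of this sketch: a range of 0 or less penalizes nothing.
    """
    if penalty_range > 0:
        recent = set(generated_ids[-penalty_range:])
    else:
        recent = set()
    adjusted = list(logits)
    for token_id in recent:
        if 0 <= token_id < len(adjusted):
            score = adjusted[token_id]
            # Either branch makes the token less likely when penalty > 1.
            adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


if __name__ == "__main__":
    logits = [2.0, -1.0, 0.5]      # illustrative scores for token IDs 0, 1, 2
    history = [0, 1, 2, 1]         # token 0 falls outside a range of 2
    print(apply_repetition_penalty(logits, history, penalty=2.0, penalty_range=2))
```

In this example only tokens 1 and 2 are penalized, because token 0 was generated more than two tokens ago.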

# Preamble

Text that is inserted right above the chat to modify the writing style. The recommended format is a list of short tags, like "[ Style: chat, detailed, sensory ]".
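
As a small illustration of the recommended format, the following sketch builds a preamble string from a list of tags. The helper name `format_style_preamble` is hypothetical; only the resulting bracketed format comes from the recommendation above.

```python
def format_style_preamble(tags):
    """Hypothetical helper: join short tags into a bracketed style preamble."""
    cleaned = [tag.strip() for tag in tags if tag.strip()]
    if not cleaned:
        return ""
    return "[ Style: " + ", ".join(cleaned) + " ]"


if __name__ == "__main__":
    print(format_style_preamble(["chat", "detailed", "sensory"]))
    # prints: [ Style: chat, detailed, sensory ]
```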

# Top P

Limits the token pool to the smallest set of most likely tokens whose probabilities add up to at least p. A lower number is more consistent, but less creative.
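
A minimal sketch of this kind of filtering (often called nucleus sampling) is shown below. The function name and the example probabilities are made up for illustration, and this is not taken from NovelAI's code.

```python
def top_p_filter(probs, p):
    """Keep the most likely tokens until their probabilities sum to at least p.

    Returns a dict mapping token index to its renormalized probability.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.15, 0.05]  # illustrative token probabilities
    # With p = 0.75, only the two most likely tokens remain in the pool.
    print(top_p_filter(probs, 0.75))
```

Lowering p shrinks the pool toward the single most likely token, which is why lower values give more consistent output.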

# Top K

Limits the token pool to the k most likely tokens. A lower number is more consistent, but less creative.
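
For comparison with Top P, here is a minimal sketch of Top K filtering. As before, the function name and example values are assumptions made for illustration only.

```python
def top_k_filter(probs, k):
    """Keep only the k most likely tokens and renormalize their probabilities.

    Returns a dict mapping token index to its renormalized probability.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in order)
    return {i: probs[i] / total for i in order}


if __name__ == "__main__":
    probs = [0.1, 0.6, 0.3]  # illustrative token probabilities
    # With k = 2, the least likely token (index 0) is removed from the pool.
    print(top_k_filter(probs, 2))
```

Unlike Top P, the pool size here is fixed at k regardless of how the probabilities are distributed.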