Large Language Model Settings
Each Large Language Model (LLM) exposes settings that can be adjusted through its API to make responses more reliable, more creative, or more deterministic. Finding good values for these parameters takes some tuning and experimentation. They are used alongside prompt engineering to steer the model toward the desired output.
Temperature
This value plays a crucial role in the output of the response. Lower values give more deterministic answers, while higher values give more creative and less deterministic answers.
Raising the temperature increases the weights of other possible tokens, which makes more diverse responses possible.
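The effect on token weights can be sketched with a temperature-scaled softmax. This is a minimal illustration, not any provider's actual implementation; the function name `softmax_with_temperature` and the example scores are assumptions made for the sketch.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw token scores (logits) into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]

low = softmax_with_temperature(logits, 0.2)   # sharp: the top token dominates
high = softmax_with_temperature(logits, 2.0)  # flat: other tokens gain weight

print("temperature 0.2:", [round(p, 3) for p in low])
print("temperature 2.0:", [round(p, 3) for p in high])
```

With the low temperature almost all probability sits on the highest-scoring token, while the high temperature spreads probability across the alternatives.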
Top P
Use either temperature or top_p
It is recommended to use only one of the two, temperature or top_p, and not both
This is an alternative setting for altering the output of the LLM. A higher top_p allows the LLM to consider more possible words, which makes answers more diverse and creative. A lower top_p restricts the LLM to the words it is most confident in, which makes answers more deterministic and factual.
If you are using temperature and not getting the desired results, try experimenting with top_p, which might give you better results.
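One common way to picture top_p is as nucleus filtering: keep only the most likely tokens until their combined probability reaches top_p, then renormalize. The sketch below assumes that interpretation; the function name `top_p_filter` and the toy probabilities are hypothetical, and real APIs may differ in detail.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of top tokens whose cumulative probability
    reaches top_p, then renormalize the survivors."""
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in the range (0, 1]")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Hypothetical next-token distribution.
probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "rock": 0.05}

narrow = top_p_filter(probs, 0.5)  # only the most confident token survives
wide = top_p_filter(probs, 0.9)    # more candidates stay in play

print(narrow)
print(wide)
```

A low top_p leaves only the tokens the model is most confident in, while a higher top_p keeps less likely candidates available for sampling.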
Max Length
This parameter caps the number of tokens in the output. It helps control costs and cuts off long or irrelevant responses.
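As a rough sketch of the idea, the cap can be thought of as cutting off a token stream after a fixed count. The generator `fake_model` below is a hypothetical stand-in for a model that would otherwise keep generating; the name `max_tokens` is an assumption, and the exact parameter name varies by provider.

```python
from itertools import islice


def fake_model():
    """Hypothetical stand-in for a model that never stops on its own."""
    n = 0
    while True:
        yield f"tok{n}"
        n += 1


def cap_tokens(token_stream, max_tokens):
    """Return at most max_tokens tokens from the stream."""
    return list(islice(token_stream, max_tokens))


output = cap_tokens(fake_model(), 5)
print(output)
```

Because billing is typically tied to token counts, a hard cap like this also puts an upper bound on the cost of a single response.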
Stop Sequences
Using this parameter, we can stop the model from generating any further tokens once it produces a given sequence. It is a niche parameter used only in certain use cases, e.g. code generation.
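For the code generation case, the behavior can be approximated by cutting the text at the first stop sequence that appears. This is a post-hoc sketch of the effect, not how a provider implements it during generation; the function name `truncate_at_stop` and the sample text are hypothetical.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # ignore empty stop strings
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


# Hypothetical model output that runs on into a second, unwanted function.
generated = "def add(a, b):\n    return a + b\n\ndef unrelated():\n    pass\n"

# Stop as soon as a new top-level definition begins.
result = truncate_at_stop(generated, ["\ndef ", "\nclass "])
print(result)
```

Here the stop sequences keep only the first function and discard whatever the model went on to write afterwards.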
Frequency Penalty
This parameter applies a penalty when the model uses certain tokens or words frequently, and the penalty grows the more often a token has already appeared. It is used to keep the model from generating repetitive words.
Presence Penalty
This parameter is very similar to the frequency penalty, but the same penalty is applied to every token that has already appeared, regardless of how often it appeared. Because the penalty is applied equally to all of those tokens, it helps generate more diverse and creative responses. The effect increases as the value of this parameter increases.
Use Either Frequency Penalty or Presence Penalty
It is recommended to use only one of the two - frequency penalty and presence penalty
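The difference between the two penalties can be sketched as an adjustment to token scores before sampling. The formula below follows the formulation described in OpenAI's API documentation (a count-scaled term for frequency, a flat term for presence); other providers may compute it differently, and the function name `apply_penalties` and the sample values are hypothetical.

```python
from collections import Counter


def apply_penalties(logits, generated_tokens,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the score of tokens that already appeared in the output.

    frequency_penalty scales with how many times a token appeared;
    presence_penalty is a flat deduction for any token that appeared at all.
    """
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        seen = 1.0 if count > 0 else 0.0
        adjusted[token] = (logit
                           - count * frequency_penalty
                           - seen * presence_penalty)
    return adjusted


# Hypothetical scores and generation history.
logits = {"the": 3.0, "cat": 2.0, "sat": 1.0}
history = ["the", "the", "the", "cat"]

freq_only = apply_penalties(logits, history, frequency_penalty=0.5)
pres_only = apply_penalties(logits, history, presence_penalty=0.5)

print("frequency penalty:", freq_only)
print("presence penalty: ", pres_only)
```

With the frequency penalty, "the" loses three times as much as "cat" because it appeared three times. With the presence penalty, both lose the same amount, and "sat" is untouched in either case.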
Results vary depending on the LLM
Make sure to test thoroughly, as the results will vary depending on the LLM.