dolly-v2-7b Model Card
Summary
dolly-v2-7b is an instruction-following large language model trained on the Databricks machine learning platform that is licensed for commercial use. Based on pythia-6.9b, Dolly is trained on ~15k instruction/response fine tuning records from databricks-dolly-15k, generated by Databricks employees in capability domains from the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA and summarization. dolly-v2-7b is not a state-of-the-art model, but does exhibit surprisingly high quality instruction following behavior not characteristic of the foundation model on which it is based.
Dolly v2 is also available in these other model sizes:
dolly-v2-12b, a 12 billion parameter model based on pythia-12b
dolly-v2-3b, a 2.8 billion parameter model based on pythia-2.8b
Please refer to the dolly GitHub repo for tips on running inference for various GPU configurations.
Model Overview
dolly-v2-7b is a 6.9 billion parameter causal language model created by Databricks that is derived from EleutherAI’s Pythia-6.9b and fine-tuned on a ~15K record instruction corpus generated by Databricks employees and released under a permissive license (CC-BY-SA).
Usage
To use the model with the transformers library on a machine with GPUs, first make sure you have the transformers and accelerate libraries installed. In a Databricks notebook you could run:
%pip install accelerate>=0.12.0 transformers[torch]==4.25.1
The instruction following pipeline can be loaded using the pipeline function as shown below. This loads a custom InstructionTextGenerationPipeline found in the model repo here, which is why trust_remote_code=True is required. Including torch_dtype=torch.bfloat16 is generally recommended if this type is supported in order to reduce memory usage. It does not appear to impact output quality. It is also fine to remove it if there is sufficient memory.
Example Code
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

if __name__ == '__main__':
    # Load the Dolly v2 7B model from the ModelScope hub as a text-generation pipeline.
    model = "AI-ModelScope/dolly-v2-7b"
    pipe = pipeline(Tasks.text_generation, model=model, model_revision='v1.0.1', device='cuda:0', max_length=40)

    # Run a single instruction through the pipeline and print the generated response.
    instruction = "Explain to me the difference between nuclear fission and fusion."
    output = pipe(instruction)
    print(output)
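The ModelScope example above wraps the same model. To follow the transformers route described earlier, a minimal sketch (assuming the databricks/dolly-v2-7b weights on the Hugging Face Hub and the custom pipeline class shipped in that model repo) looks like this:

import torch
from transformers import pipeline

# Load the custom InstructionTextGenerationPipeline from the model repo.
# trust_remote_code=True is required so transformers can use that pipeline class;
# torch_dtype=torch.bfloat16 reduces memory usage where bfloat16 is supported.
generate_text = pipeline(
    model="databricks/dolly-v2-7b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

# The custom pipeline returns the generated response; inspect the returned
# structure for the exact format used by your version of the pipeline code.
res = generate_text("Explain to me the difference between nuclear fission and fusion.")
print(res)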
Known Limitations
Performance Limitations
dolly-v2-7b is not a state-of-the-art generative language model. The Dolly model family is under active development, and so any list of shortcomings is unlikely to be exhaustive, but we include known limitations and misfires here as a means to document and share our preliminary findings with the community. In particular, dolly-v2-7b struggles with: syntactically complex prompts, programming problems, mathematical operations, factual errors, dates and times, open-ended question answering, hallucination, enumerating lists of specific length, stylistic mimicry, having a sense of humor, etc. Moreover, we find that dolly-v2-7b does not have some capabilities, such as well-formatted letter writing, present in the original model.
Dataset Limitations
Like all language models, dolly-v2-7b reflects the content and limitations of its training corpuses.
databricks-dolly-15k: The training data on which dolly-v2-7b is instruction tuned represents natural language instructions generated by Databricks employees during a period spanning March and April 2023 and includes passages from Wikipedia as reference passages for instruction categories like closed QA and summarization. To our knowledge it does not contain obscenity, intellectual property or personally identifying information about non-public figures, but it may contain typos and factual errors. The dataset may also reflect biases found in Wikipedia. Finally, the dataset likely reflects the interests and semantic choices of Databricks employees, a demographic which is not representative of the global population at large.
Databricks is committed to ongoing research and development efforts to develop helpful, honest and harmless AI technologies that maximize the potential of all individuals and organizations.
Benchmark Metrics
Below you'll find various models' benchmark performance on the EleutherAI LLM Evaluation Harness; model results are sorted by geometric mean to produce an intelligible ordering (a short sketch of that computation follows the table). As outlined above, these results demonstrate that dolly-v2-7b is not state of the art, and in fact underperforms dolly-v1-6b in some evaluation benchmarks. We believe this owes to the composition and size of the underlying fine tuning datasets, but a robust statement as to the sources of these variations requires further study.
| model | openbookqa | arc_easy | winogrande | hellaswag | arc_challenge | piqa | boolq | gmean |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| EleutherAI/pythia-2.8b | 0.348 | 0.585859 | 0.589582 | 0.591217 | 0.323379 | 0.73395 | 0.638226 | 0.523431 |
| EleutherAI/pythia-6.9b | 0.368 | 0.604798 | 0.608524 | 0.631548 | 0.343857 | 0.761153 | 0.6263 | 0.543567 |
| databricks/dolly-v2-3b | 0.384 | 0.611532 | 0.589582 | 0.650767 | 0.370307 | 0.742655 | 0.575535 | 0.544886 |
| EleutherAI/pythia-12b | 0.364 | 0.627104 | 0.636148 | 0.668094 | 0.346416 | 0.760065 | 0.673394 | 0.559676 |
| EleutherAI/gpt-j-6B | 0.382 | 0.621633 | 0.651144 | 0.662617 | 0.363481 | 0.761153 | 0.655963 | 0.565936 |
| databricks/dolly-v2-12b | 0.408 | 0.63931 | 0.616417 | 0.707927 | 0.388225 | 0.757889 | 0.568196 | 0.56781 |
| databricks/dolly-v2-7b | 0.392 | 0.633838 | 0.607735 | 0.686517 | 0.406997 | 0.750816 | 0.644037 | 0.573487 |
| databricks/dolly-v1-6b | 0.41 | 0.62963 | 0.643252 | 0.676758 | 0.384812 | 0.773667 | 0.687768 | 0.583431 |
| EleutherAI/gpt-neox-20b | 0.402 | 0.683923 | 0.656669 | 0.7142 | 0.408703 | 0.784004 | 0.695413 | 0.602236 |
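The gmean column appears to be the geometric mean of the seven per-task scores. A minimal sketch of that computation in plain Python, using the dolly-v2-7b row above, is:

import math

# Per-task scores for databricks/dolly-v2-7b, copied from the table above.
scores = [0.392, 0.633838, 0.607735, 0.686517, 0.406997, 0.750816, 0.644037]

# Geometric mean: the n-th root of the product of the scores,
# computed in log space for numerical stability.
gmean = math.exp(sum(math.log(s) for s in scores) / len(scores))
print(round(gmean, 6))  # ~0.573487, matching the gmean column above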
Happy Hacking!