Commit bf76942

2024-09-06-xlm_roberta_base_finetuned_panx_german_ahmad_alismail_pipeline_en (#14401)
* Add model 2024-09-17-whisper_base_portuguese_zuazo_pipeline_pt * Add model 2024-09-16-code_human_ai_pipeline_en * Add model 2024-09-12-gal_ner_xlmr_2_pipeline_en * Add model 2024-09-16-kinyaroberta_large_kinte_finetuned_kinyarwanda_tweet_finetuned_kinyarwanda_sent3_en * Add model 2024-09-09-roberta_base_catalan_v2_ca * Add model 2024-09-13-sent_bert_base_nli_ct_en * Add model 2024-09-14-whisper_small_urdu_omar47_pipeline_ur * Add model 2024-09-12-xlm_roberta_base_finetuned_panx_german_r45289_pipeline_en * Add model 2024-09-17-xlm_roberta_base_operator_en * Add model 2024-09-12-bert_resume_classification_en * Add model 2024-09-07-distillber_squadv2_pipeline_en * Add model 2024-09-09-f_roberta_classifier2_pipeline_en * Add model 2024-09-08-self_harm_bert_en * Add model 2024-09-13-roberta_base_epoch_72_en * Add model 2024-09-17-hfa_poly_english_small_pipeline_en * Add model 2024-09-17-bert_base_uncased_ep_1_12_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_200_en * Add model 2024-09-16-bertweet_large_epoch6_batch4_lr2e_05_w0_005_pipeline_en * Add model 2024-09-13-gqa_roberta_german_legal_squad_part_augmented_2000_pipeline_de * Add model 2024-09-17-whisper_small_ndonga_pipeline_en * Add model 2024-09-17-malwhisper_v1_small_ml * Add model 2024-09-17-whisper_tiny_indonesian_evanarlian_id * Add model 2024-09-12-roberta_base_finetuned_squad_f_arnold_en * Add model 2024-09-16-opus_maltese_english_romanian_finetuned_romanian_tonga_tonga_islands_english_pontifexmaximus_pipeline_en * Add model 2024-09-17-whisper_small_marathi_steja_mr * Add model 2024-09-15-whisper_tiny_turkish_ckandemir_tr * Add model 2024-09-11-babyberta_aochildes_2_5m_wikipedia1_2_5m_with_masking_seed3_finetuned_squad_en * Add model 2024-09-09-q2d_origin_re_5_en * Add model 2024-09-15-kaz_roberta_base_ft_qa_turkish_maltese_tonga_tonga_islands_kaz_pipeline_kk * Add model 2024-09-16-nerd_nerd_random0_seed0_twitter_roberta_base_dec2020_en * Add model 2024-09-17-whisper_base_catalan_pipeline_ca 
* Add model 2024-09-17-bsc_bio_ehr_spanish_drugtemist_pipeline_es * Add model 2024-09-17-whisper_small_basque_cv16_1_eu * Add model 2024-09-17-whisper_small_basque_cv16_1_pipeline_eu * Add model 2024-09-16-dataequity_opus_maltese_spanish_english_pipeline_en * Add model 2024-09-11-transfer_course_distilroberta_base_mrpc_glue_nestor_mamani_en * Add model 2024-09-15-xlmr_romanian_english_all_shuffled_42_test1000_en * Add model 2024-09-15-distilbert_base_uncased_finetuned_emotion_deionk_en * Add model 2024-09-08-lab1_random_chenxin0903_pipeline_en * Add model 2024-09-07-xlmroberta_ner_xlm_roberta_base_finetuned_panx_ner_pipeline_it * Add model 2024-09-14-sent_hinglish_sbert_en * Add model 2024-09-09-opus_maltese_english_romanian_finetuned_english_tonga_tonga_islands_romanian_finetuned_english_tonga_tonga_islands_romanian_en * Add model 2024-09-09-opus_maltese_finetuned_english_spanish_en * Add model 2024-09-14-opus_maltese_italian_english_bds_en * Add model 2024-09-16-vanilla_dermat_es * Add model 2024-09-08-burmese_awesome_wnut_model_jaepax_pipeline_en * Add model 2024-09-09-xlm_roberta_base_finetuned_marc_begar_pipeline_en * Add model 2024-09-13-splade_pp_english_v2_en * Add model 2024-09-11-gpu1_pipeline_en * Add model 2024-09-13-sent_bert_medium_arabic_pipeline_ar * Add model 2024-09-14-marian_finetuned_kde4_english_tonga_tonga_islands_french_accelerate_huggingface_course_en * Add model 2024-09-12-finetuned_twitter_targeted_insult_roberta_en * Add model 2024-09-12-flat_model_pipeline_en * Add model 2024-09-11-xlm_roberta_base_finetuned_panx_french_sungkwangjoong_pipeline_en * Add model 2024-09-12-hate_hate_balance_random3_seed0_twitter_roberta_base_2021_124m_en * Add model 2024-09-15-distilbert_base_uncased_odm_zphr_0st2sd_ut72ut1_plprefix0stlarge2_simsp400_clean100_pipeline_en * Add model 2024-09-09-chai_reward_deberta_classifier_en * Add model 2024-09-09-xlm_roberta_base_finetuned_panx_german_italian_en * Add model 2024-09-17-your_model_name_en * Add model 
2024-09-17-your_model_name_pipeline_en * Add model 2024-09-13-khipu_finetuned_amazon_reviews_multi_andrescastro_itm_en * Add model 2024-09-11-burmese_awesome_qa_model_ih138_en * Add model 2024-09-09-deberta_disaster_tweet_recognizer_pipeline_en * Add model 2024-09-09-opus_maltese_turkish_tonga_tonga_islands_english_pipeline_en * Add model 2024-09-17-bangla_asr_v7_pipeline_bn * Add model 2024-09-15-distilbert_base_uncased_odm_zphr_0st4sd_ut72ut5_plprefix0stlarge4_simsp100_clean300_en * Add model 2024-09-17-bert_base_squad_v1_1_portuguese_ibama_v0_220240904182329_en * Add model 2024-09-11-bsc_bio_ehr_spanish_vih_juicio_anam_urgen_en * Add model 2024-09-15-xlm_roberta_base_finetuned_panx_italian_scionk_pipeline_en * Add model 2024-09-12-finetuning_emotion_model_surajmahapatra_pipeline_en * Add model 2024-09-15-distilbert_finetuned_custom_pipeline_en * Add model 2024-09-13-msc_baseline_marian_pipeline_en * Add model 2024-09-17-whisper_small_bengali_crblp_pipeline_bn * Add model 2024-09-17-whisper_small_custom300_1e_5_va2000_pipeline_en * Add model 2024-09-15-takalane_ssw_roberta_pipeline_tn * Add model 2024-09-14-burmese_awesome_qa_model_sazara_pipeline_en * Add model 2024-09-16-definition_classification_v1_en * Add model 2024-09-17-whisper_base_portuguese_zuazo_pt * Add model 2024-09-10-burmese_nepal_bhasa_model_pipeline_en * Add model 2024-09-13-deberta_v3_large_survey_related_passage_consistency_rater_half_gpt4_pipeline_en * Add model 2024-09-10-xlm_roberta_base_finetuned_panx_german_french_ericklerouge123_pipeline_en * Add model 2024-09-17-whisper4_en * Add model 2024-09-17-workstation_whisper_base_finetune_teacher__babble_noise_mozilla_100_epochs_batch_4_pipeline_en * Add model 2024-09-17-hate_hate_balance_random3_seed2_bernice_pipeline_en * Add model 2024-09-08-roberta_base_climate_evidence_related_en * Add model 2024-09-05-question_answering_xlm_roberta_base_pipeline_en * Add model 2024-09-17-dipromats_subtask_1_base_train_en * Add model 
2024-09-17-whisper_cli_dropout_small_oriya_pipeline_or * Add model 2024-09-13-distilbert_base_uncased_finetuned_emotion_xiumu1988_en * Add model 2024-09-07-hupd_distilroberta_base_pipeline_en * Add model 2024-09-17-whisper_small_yoruba_kaggle_train_pipeline_en * Add model 2024-09-09-distilbert_base_uncased_finetuned_emotion_saneryi_en * Add model 2024-09-09-arabic2_en * Add model 2024-09-13-culturebank_controversial_classifier_en * Add model 2024-09-14-opus_maltese_indonesian_english_jakarta_best_loss_bleu_en * Add model 2024-09-15-text_clf_model_v03_pipeline_en * Add model 2024-09-13-sent_bulbert_chitanka_model_pipeline_bg * Add model 2024-09-17-whisper_small_korean_yspeed_hi * Add model 2024-09-15-xlm_roberta_base_finetuned_panx_french_haesun_en * Add model 2024-09-16-english_tonga_tonga_islands_arabic_version3_en * Add model 2024-09-17-ep15_pipeline_en * Add model 2024-09-17-bert_base_uncased_ep_1_12_b_32_lr_8e_07_dp_0_5_swati_0_southern_sotho_false_fh_false_hs_200_pipeline_en * Add model 2024-09-15-distilbert_base_uncased_finetuned_emrqa_msquad_pipeline_en * Add model 2024-09-16-multiple_languages_coptic_english_norm_group_greekified_pipeline_en * Add model 2024-09-11-ternary_persian_sentiment_analysis_pipeline_en * Add model 2024-09-15-efficient_mlm_m0_15_801010_pipeline_en * Add model 2024-09-17-whisper_small_arnw_ar * Add model 2024-09-17-roberta_tagalog_base_ft_udpos213_serbian_pipeline_tl * Add model 2024-09-17-whisper_small_swedish_v4_sv * Add model 2024-09-17-whisper_small_hindi_mukund017_hi * Add model 2024-09-17-whisper_tiny_engmed_v2_pipeline_en * Add model 2024-09-14-whisper_small_cebtoeng_hi * Add model 2024-09-17-whisper_small_galician_zuazo_gl * Add model 2024-09-17-whisper_tiny_italian_6_it * Add model 2024-09-17-whisper_small_singlish_augmented_again_1200steps_en * Add model 2024-09-08-cross_encoder_russian_msmarco_ru * Add model 2024-09-17-whisper_tiny_minds_malikibrar_pipeline_en * Add model 
2024-09-06-task_implicit_task__model_deberta__aug_method_ri_en * Add model 2024-09-11-xlm_roberta_base_mapa_coarse_ner_en * Add model 2024-09-13-bert_large_uncased_sst2_pipeline_en * Add model 2024-09-11-burmese_awesome_model_hannestt_pipeline_en * Add model 2024-09-15-custom_peft_whiper_small_korean_v3_en * Add model 2024-09-14-xlm_roberta_base_finetuned_panx_french_kata958_pipeline_en * Add model 2024-09-13-sms_spam_model_v1_2_en * Add model 2024-09-17-whisper_tiny_minds14_english_bayerasif_pipeline_en * Add model 2024-09-12-dken_en * Add model 2024-09-15-distilbert_base_uncased_finetuned_emotion_maydogdu_pipeline_en * Add model 2024-09-12-datosw_v1_2_pipeline_en * Add model 2024-09-12-marian_finetuned_kde4_english_tonga_tonga_islands_french_vonewman_pipeline_en * Add model 2024-09-16-emoji_emoji_random1_seed2_twitter_roberta_base_2021_124m_pipeline_en * Add model 2024-09-15-distilbert_sanskrit_saskta_glue_experiment_logit_kd_pretrain_stsb_pipeline_en * Add model 2024-09-16-opus_maltese_english_romanian_finetuned_english_tonga_tonga_islands_romanian_diegoalysson_en * Add model 2024-09-16-opus_maltese_semitic_languages_english_finetuned_npomo_english_15_epochs_pipeline_en * Add model 2024-09-09-output_sotseth_pipeline_en * Add model 2024-09-17-base_english_combined_v4_2_0_1_8_1e_05_dulcet_sweep_34_en * Add model 2024-09-17-whisper_base_chuvash_highlr_czech_cs * Add model 2024-09-11-fine_tune_spatial_pipeline_en * Add model 2024-09-12-xlm_roberta_base_finetuned_panx_all_hirosay_pipeline_en * Add model 2024-09-09-bge_base_financial_matryoshka_dpokhrel_en * Add model 2024-09-07-study_dummy_en * Add model 2024-09-17-whisper_small_ne2_1_en * Add model 2024-09-11-saved_model_body_pipeline_en * Add model 2024-09-15-mini_text_classification_finetune_model_pipeline_en * Add model 2024-09-15-furina_seed42_eng_kinyarwanda_amh_cross_0_0001_en * Add model 2024-09-16-opus_maltese_english_russian_finetuned_pipeline_en * Add model 
2024-09-15-finetuning_sentiment_model_3000_samples_klumdedum_en * Add model 2024-09-10-distilbert_base_uncased_fillmask_finetuned_imdb_classifier_nlp_course_chapter7_section2_pipeline_en * Add model 2024-09-10-xlmr_finetuned_igbo_en * Add model 2024-09-16-focaltrain_pipeline_en * Add model 2024-09-17-tinymax_pipeline_en * Add model 2024-09-12-amharicqa_roberta_en * Add model 2024-09-14-xlm_roberta_base_finetuned_panx_german_ysige_pipeline_en * Add model 2024-09-15-whisper_small_urdu_howmannymore_pipeline_en * Add model 2024-09-14-finetuned_marianmtmodel_v4_specialfrom_ccmatrix77k_en * Add model 2024-09-14-model_for_french_pipeline_en * Add model 2024-09-09-best_model_yelp_polarity_64_42_en * Add model 2024-09-09-opus_maltese_english_german_bds_pipeline_en * Add model 2024-09-15-cuatr_distilbert_en * Add model 2024-09-12-xlm_roberta_base_finetuned_panx_french_jbreunig_pipeline_en * Add model 2024-09-10-whisper_small_tuned_en * Add model 2024-09-17-malasar_luke_dict_nan * Add model 2024-09-17-tiny_english_uva_chunked_with_synthetic_v2_4_1e_05_pipeline_en * Add model 2024-09-17-whisper_tiny_spanish_herme_pipeline_es * Add model 2024-09-10-incremental_semi_supervised_training_base_pipeline_en * Add model 2024-09-14-shami2english_en * Add model 2024-09-16-roberta_large_mnli_fld_pipeline_en * Add model 2024-09-10-roberta_base_bne_squad2_spanish_es * Add model 2024-09-15-fine_tuned_distilbert_isha31101999_pipeline_en * Add model 2024-09-17-whisper_base_chinese_cer_zh * Add model 2024-09-15-distilbert_base_uncased_finetuned_squad_wendywangwww_pipeline_en * Add model 2024-09-08-distilbert_base_uncased_finetuned_ner_cnguyenta_en * Add model 2024-09-17-whisper_small_hungarian_gyikesz_pipeline_hu * Add model 2024-09-13-all_roberta_large_v1_travel_3_16_5_en * Add model 2024-09-12-opus_maltese_english_spanish_finetuned_english_tonga_tonga_islands_spanish_tamil_5epochs_pipeline_en * Add model 2024-09-10-cross_all_bs160_allneg_finetuned_webnlg2020_relevance_en * Add model 
2024-09-12-whisper_small_cv17_hungarian_hu * Add model 2024-09-17-whisper_tiny_julienchoukroun_en * Add model 2024-09-15-scenario_tcr_4_data_cardiffnlp_tweet_sentiment_multilingual_all_pipeline_xx * Add model 2024-09-14-finroberta_pipeline_en * Add model 2024-09-11-opus_maltese_english_bkm_final_60_pipeline_en * Add model 2024-09-16-lab1_finetuning_den_sota_en * Add model 2024-09-17-breeze_dsw_tiny_indonesian_id * Add model 2024-09-17-breeze_dsw_tiny_indonesian_pipeline_id * Add model 2024-09-13-roberta_base_epoch_60_en * Add model 2024-09-05-ae_detection_distilbert_pipeline_en * Add model 2024-09-17-whisper_small_train_v2_1_en * Add model 2024-09-11-opus_maltese_english_romanian_finetuned_english_tonga_tonga_islands_romanian_redpandaainlp_en * Add model 2024-09-17-whisper_small_train_v2_1_pipeline_en * Add model 2024-09-12-burmese_awesome_model_willw9758_pipeline_en * Add model 2024-09-08-recipes_trainer_n_sentences_per_recipe_3_sep_true_pipeline_en * Add model 2024-09-17-whisper_tiny_ga2en_v1_4_ga * Add model 2024-09-17-whisper_small_arnw_pipeline_ar * Add model 2024-09-16-turkish_medical_field_detection_8_pipeline_en * Add model 2024-09-16-roberta_large_finetuned_cola_cvapict_en * Add model 2024-09-17-whisper_small_hindi_mukund017_pipeline_hi * Add model 2024-09-17-whisper_small_divehi_agercas_dv * Add model 2024-09-17-roberta_large_metaie_super_academia_gpt4o_pipeline_en * Add model 2024-09-08-schemeclassifier_eng_en * Add model 2024-09-14-whisper_small_russian_1k_steps_ru * Add model 2024-09-14-luxembert_v2_en * Add model 2024-09-10-roberta_base_coqa_en * Add model 2024-09-15-grammar_classifier_pipeline_en * Add model 2024-09-14-whisper_medium_uzbek_extra_dataset_v2_en * Add model 2024-09-10-xlm_roberta_base_finetuned_panx_all_youngbreadho_en * Add model 2024-09-14-whisper_small3_italian_it * Add model 2024-09-12-first_qa_model_pipeline_en * Add model 2024-09-15-whisper_base_cer_gn * Add model 
2024-09-15-stego_classifier_checkpoint_epoch_30_2024_07_26_12_23_45_pipeline_en * Add model 2024-09-09-opus_maltese_english_dutch_finetuned_combined_38_train_val_pipeline_en * Add model 2024-09-07-idiom_xlm_roberta_en * Add model 2024-09-06-dummy_model_fab7_pipeline_en * Add model 2024-09-09-chai_deberta_v3_base_reward_model_pipeline_en * Add model 2024-09-11-distilbert_sarcascm_classifier_en * Add model 2024-09-17-chinese_roberta_wwm_ext_2_0_8_ddp_en * Add model 2024-09-07-dummy_model_hanzhuo_pipeline_en * Add model 2024-09-10-training_v2_ru * Add model 2024-09-17-metaqa_en * Add model 2024-09-17-metaqa_pipeline_en * Add model 2024-09-16-opus_maltese_slavic_languages_english_finetuned_ukrainian_tonga_tonga_islands_english_pipeline_en * Add model 2024-09-17-predict_perception_xlmr_cause_object_en * Add model 2024-09-15-distilbert_finetuned_squadv2_mf212_en * Add model 2024-09-12-quran_whisper_tiny_v1_ar * Add model 2024-09-13-bert_vllm_gemma2b_7_pipeline_en * Add model 2024-09-11-all_mpnet_base_v2_2022_11_07_pipeline_en * Add model 2024-09-17-bert_base_german_cased_finetuned_squad_en * Add model 2024-09-14-opus_maltese_english_romanian_finetuned_english_tonga_tonga_islands_romanian_minzzi_en * Add model 2024-09-17-xlm_roberta_base_vtoc_100_pipeline_en * Add model 2024-09-11-phishing_email_detection_21_07_en * Add model 2024-09-09-lab1_random_reshphil_en * Add model 2024-09-09-maltese_coref_english_arabic_gender_exp_pipeline_en * Add model 2024-09-15-roberta_tuned_trial_13_13_2022_en * Add model 2024-09-15-models_mil00_pipeline_en * Add model 2024-09-08-roberta_base_squad2_finetuned_squad_katxtong_en * Add model 2024-09-08-sent_xlm_roberta_base_finetuned_burmese_dear_watson2_pipeline_en * Add model 2024-09-17-burmese_awesome_qa_model_nada_ghazouani_pipeline_en * Add model 2024-09-17-finetuned_bert_model_squad_datset_pipeline_en * Add model 2024-09-17-distilbert_base_uncased_squad2_p35_pipeline_en * Add model 
2024-09-17-distilbert_base_uncased_finetuned_squad_hashemghanem_pipeline_en * Add model 2024-09-17-burmese_awesome_qa_model_lash_en * Add model 2024-09-17-burmese_awesome_qa_model_lash_pipeline_en * Add model 2024-09-12-italian_emotion_analyzer_it * Add model 2024-09-17-distilbert_base_uncased_finetuned_squad_d5716d28_serhii_korobchenko_pipeline_en * Add model 2024-09-17-distilbert_base_cased_distilled_squad_full_lora_merged_en * Add model 2024-09-11-multi_qa_mpnet_base_dot_v1_covidqa_search_75_25_2epoch_full_en * Add model 2024-09-11-klue_bert_base_sentiment_pipeline_ko * Add model 2024-09-17-burmese_qa_model_yadah_pipeline_en * Add model 2024-09-17-whisper_small_yoruba_kaggle_train_en * Add model 2024-09-15-tuf_albert_5e_en * Add model 2024-09-09-quberta_qu * Add model 2024-09-17-whisper_tiny_chinese_zhihcheng_pipeline_zh * Add model 2024-09-11-distilbert_base_uncased_finetuned_squad_test2_en --------- Co-authored-by: ahmedlone127 <[email protected]>
1 parent fc72501 commit bf76942

File tree

1,372 files changed, with 110,508 additions and 0 deletions.

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
---
layout: model
title: Bulgarian bulbert_chitanka_model BertEmbeddings from mor40
author: John Snow Labs
name: bulbert_chitanka_model
date: 2024-09-02
tags: [bg, open_source, onnx, embeddings, bert]
task: Embeddings
language: bg
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: BertEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `bulbert_chitanka_model` is a Bulgarian model originally trained by mor40.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bulbert_chitanka_model_bg_5.5.0_3.0_1725318518639.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bulbert_chitanka_model_bg_5.5.0_3.0_1725318518639.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
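
The archive name in the links above appears to pack the card's metadata into one string: model name, language, Spark NLP edition, Spark version, and a 13-digit number that reads as a Unix timestamp in milliseconds. The Python sketch below splits a name under that assumed layout. The pattern and the `parse_archive_name` helper are illustrative only and not part of Spark NLP; the layout is inferred from this card alone.

```python
import re
from datetime import datetime, timezone

# Assumed layout, inferred from this card only:
#   <name>_<lang>_<edition>_<spark_version>_<epoch_millis>.zip
ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<edition>\d+\.\d+\.\d+)"
    r"_(?P<spark_version>\d+\.\d+)_(?P<millis>\d{13})\.zip$"
)


def parse_archive_name(url):
    """Split a model archive URL into the fields encoded in its file name."""
    file_name = url.rsplit("/", 1)[-1]
    match = ARCHIVE_RE.match(file_name)
    if match is None:
        raise ValueError("unexpected archive name: " + file_name)
    fields = match.groupdict()
    # Interpret the trailing number as milliseconds since the Unix epoch (assumption).
    millis = int(fields.pop("millis"))
    stamp = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    fields["timestamp_date"] = stamp.date().isoformat()
    return fields


info = parse_archive_name(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "bulbert_chitanka_model_bg_5.5.0_3.0_1725318518639.zip"
)
print(info)
```

For this card the decoded timestamp falls on 2024-09-02 (UTC), which agrees with the `date` field in the front matter.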

## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertEmbeddings
from pyspark.ml import Pipeline

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

# Wrap the raw text column in a document annotation
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Load the pretrained Bulgarian BERT embeddings
embeddings = BertEmbeddings.pretrained("bulbert_chitanka_model","bg") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, embeddings])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)
```
```scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // enables Seq(...).toDF; assumes a SparkSession named `spark`

// Wrap the raw text column in a document annotation
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Split each document into tokens
val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

// Load the pretrained Bulgarian BERT embeddings
val embeddings = BertEmbeddings.pretrained("bulbert_chitanka_model","bg")
  .setInputCols(Array("document", "token"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)
```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|bulbert_chitanka_model|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[document, token]|
|Output Labels:|[bert]|
|Language:|bg|
|Size:|306.1 MB|

## References

https://huggingface.co/mor40/BulBERT-chitanka-model
Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
---
layout: model
title: English oorito MarianTransformer from LRJ1981
author: John Snow Labs
name: oorito
date: 2024-09-03
tags: [en, open_source, onnx, translation, marian]
task: Translation
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: MarianTransformer
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained MarianTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `oorito` is an English model originally trained by LRJ1981.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/oorito_en_5.5.0_3.0_1725404166090.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/oorito_en_5.5.0_3.0_1725404166090.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
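```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetectorDLModel, MarianTransformer
from pyspark.ml import Pipeline

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

# Wrap the raw text column in a document annotation
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split documents into sentences; the translator reads this "sentence" column
sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx") \
    .setInputCols(["document"]) \
    .setOutputCol("sentence")

# Translate each sentence and write the result to "translation"
marian = MarianTransformer.pretrained("oorito","en") \
    .setInputCols(["sentence"]) \
    .setOutputCol("translation")

pipeline = Pipeline().setStages([documentAssembler, sentenceDL, marian])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)
```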
```scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // enables Seq(...).toDF; assumes a SparkSession named `spark`

// Wrap the raw text column in a document annotation
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Split documents into sentences; the translator reads this "sentence" column
val sentenceDL = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")
  .setInputCols(Array("document"))
  .setOutputCol("sentence")

// Translate each sentence and write the result to "translation"
val marian = MarianTransformer.pretrained("oorito","en")
  .setInputCols(Array("sentence"))
  .setOutputCol("translation")

val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDL, marian))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)
```
</div>
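
Each stage reads the columns named in `setInputCols` and writes the one named in `setOutputCol`, so every input must match the output of an earlier stage (or the source column). The plain-Python sketch below checks that wiring for a linear pipeline without needing Spark. `check_wiring` and the tuple layout are hypothetical, introduced only for illustration; they are not part of Spark NLP.

```python
def check_wiring(stages, source_columns=("text",)):
    """Return a list of problems found in a linear pipeline's column wiring.

    `stages` is an ordered list of (stage_name, input_columns, output_column).
    """
    available = set(source_columns)
    problems = []
    for name, inputs, output in stages:
        for column in inputs:
            if column not in available:
                problems.append(name + " reads missing column '" + column + "'")
        # The stage's output becomes available to every later stage.
        available.add(output)
    return problems


# Wiring of a document -> sentence -> translation pipeline
wired = [
    ("documentAssembler", ["text"], "document"),
    ("sentenceDL", ["document"], "sentence"),
    ("marian", ["sentence"], "translation"),
]

# Same stages, but the sentence detector writes to the wrong column name
miswired = [
    ("documentAssembler", ["text"], "document"),
    ("sentenceDL", ["document"], "translation"),
    ("marian", ["sentence"], "translation"),
]

wired_problems = check_wiring(wired)
miswired_problems = check_wiring(miswired)
print(wired_problems)
print(miswired_problems)
```

The first list is empty, while the second reports that `marian` reads a `sentence` column no earlier stage produces. This is the kind of mismatch that otherwise only surfaces when the pipeline is fitted.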

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|oorito|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[sentences]|
|Output Labels:|[translation]|
|Language:|en|
|Size:|504.7 MB|

## References

https://huggingface.co/LRJ1981/OORito
Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
---
layout: model
title: English deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4 DeBertaForSequenceClassification from domenicrosati
author: John Snow Labs
name: deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4
date: 2024-09-04
tags: [en, open_source, onnx, sequence_classification, deberta]
task: Text Classification
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
engine: onnx
annotator: DeBertaForSequenceClassification
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained DeBertaForSequenceClassification model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4` is an English model originally trained by domenicrosati.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4_en_5.5.0_3.0_1725440063827.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4_en_5.5.0_3.0_1725440063827.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DeBertaForSequenceClassification
from pyspark.ml import Pipeline

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

# Wrap the raw text column in a document annotation
documentAssembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# Input columns must match the names produced above ("document", "token")
sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4","en") \
    .setInputCols(["document","token"]) \
    .setOutputCol("class")

pipeline = Pipeline().setStages([documentAssembler, tokenizer, sequenceClassifier])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)
```
```scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // enables Seq(...).toDS; assumes a SparkSession named `spark`

// DocumentAssembler takes a single input and output column
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Split each document into tokens
val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

// Input columns must match the names produced above ("document", "token")
val sequenceClassifier = DeBertaForSequenceClassification.pretrained("deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4", "en")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, sequenceClassifier))
val data = Seq("I love spark-nlp").toDS.toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)
```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|deberta_v3_large_survey_nepal_bhasa_fact_main_passage_rater_gpt4|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[document, token]|
|Output Labels:|[class]|
|Language:|en|
|Size:|1.5 GB|

## References

https://huggingface.co/domenicrosati/deberta-v3-large-survey-new_fact_main_passage-rater-gpt4
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
---
layout: model
title: English dummy_model_umalakshmi07_pipeline pipeline CamemBertEmbeddings from Umalakshmi07
author: John Snow Labs
name: dummy_model_umalakshmi07_pipeline
date: 2024-09-04
tags: [en, open_source, pipeline, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.5.0
spark_version: 3.0
supported: true
annotator: PipelineModel
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained CamemBertEmbeddings, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `dummy_model_umalakshmi07_pipeline` is an English model originally trained by Umalakshmi07.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/dummy_model_umalakshmi07_pipeline_en_5.5.0_3.0_1725409109729.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/dummy_model_umalakshmi07_pipeline_en_5.5.0_3.0_1725409109729.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
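
The two links above point at the same object: the Download URL is the S3 URI with the bucket name moved into the path under `https://s3.amazonaws.com/`. A minimal Python sketch of that mapping, assuming the path-style addressing used on this card; `s3_uri_to_https` is a hypothetical helper, not a Spark NLP or AWS function.

```python
from urllib.parse import urlparse


def s3_uri_to_https(s3_uri):
    """Rewrite s3://<bucket>/<key> as a path-style HTTPS URL (assumed layout)."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("not an S3 URI: " + s3_uri)
    # netloc holds the bucket name, path holds "/<key>"
    return "https://s3.amazonaws.com/" + parsed.netloc + parsed.path


https_url = s3_uri_to_https(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "dummy_model_umalakshmi07_pipeline_en_5.5.0_3.0_1725409109729.zip"
)
print(https_url)
```

The printed URL matches the Download link on this card.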

## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
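```python
from sparknlp.pretrained import PretrainedPipeline

# Assumes an active Spark NLP session `spark`; `df` is assumed to need a "text" column
df = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipeline = PretrainedPipeline("dummy_model_umalakshmi07_pipeline", lang = "en")
annotations = pipeline.transform(df)
```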
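```scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
import spark.implicits._ // assumes a SparkSession named `spark`

// `df` is assumed to need a "text" column
val df = Seq("I love spark-nlp").toDF("text")
val pipeline = new PretrainedPipeline("dummy_model_umalakshmi07_pipeline", lang = "en")
val annotations = pipeline.transform(df)
```
</div>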

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|dummy_model_umalakshmi07_pipeline|
|Type:|pipeline|
|Compatibility:|Spark NLP 5.5.0+|
|License:|Open Source|
|Edition:|Official|
|Language:|en|
|Size:|264.0 MB|

## References

https://huggingface.co/Umalakshmi07/dummy-model

## Included Models

- DocumentAssembler
- TokenizerModel
- CamemBertEmbeddings
