Amazon SageMaker built-in LightGBM now offers distributed training using Dask
AWS Machine Learning
JANUARY 30, 2023
The supported data formats are CSV and Parquet. To validate the performance of distributed LightGBM training, extensive benchmarking experiments were conducted on three publicly available datasets under various settings. Distributed training is a technique that processes large amounts of data in parallel across multiple machines or devices.
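To make the idea of parallel processing across workers concrete, here is a minimal sketch in plain Python. It is a generic illustration of data parallelism, not the SageMaker, LightGBM, or Dask implementation: the function names (`split_into_shards`, `local_histogram`, `merge_histograms`, `distributed_histogram`) are hypothetical, and local threads stand in for separate machines. Each worker summarizes only its own shard of the data, and the partial summaries are then merged.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(rows, num_workers):
    """Deal rows round-robin into num_workers shards (hypothetical helper)."""
    return [rows[i::num_workers] for i in range(num_workers)]


def local_histogram(shard, num_bins, lo, hi):
    """Count how many values of one shard fall into each equal-width bin."""
    counts = [0] * num_bins
    width = (hi - lo) / num_bins
    for value in shard:
        # Clamp so that value == hi lands in the last bin.
        idx = min(int((value - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


def merge_histograms(histograms):
    """Sum the per-worker histograms bin by bin."""
    return [sum(bin_counts) for bin_counts in zip(*histograms)]


def distributed_histogram(rows, num_workers=4, num_bins=5, lo=0.0, hi=10.0):
    """Build one histogram by summarizing shards in parallel, then merging.

    Assumption: threads on one machine stand in for remote workers.
    """
    shards = split_into_shards(rows, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(
            pool.map(lambda s: local_histogram(s, num_bins, lo, hi), shards)
        )
    return merge_histograms(partials)


if __name__ == "__main__":
    values = [0.5, 1.5, 2.5, 9.9, 10.0, 4.0, 6.0, 8.0]
    print(distributed_histogram(values, num_workers=3))
```

Because the merge step is a simple bin-wise sum, the result does not depend on how many workers share the data; only the amount of work per worker changes.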