Apache Spark
From MaRDI portal
Software:40132
swMATH: 28418, MaRDI QID: Q40132, FDO: Q40132
Author name not available
Source code repository: https://github.com/apache/spark
Cited In (72)
- Property-Based Testing for Spark Streaming
- Modern Datalog Engines
- Regression Neural Networks with a Highly Robust Loss Function
- Computation Against a Neighbour: Addressing Large-Scale Distribution and Adaptivity with Functional Programming and Scala
- Temporal concatenation for Markov decision processes
- Performance Comparison of Machine Learning Platforms
- Combining Interval Time Series Forecasts. A First Step in a Long Way (Research Agenda)
- A Detailed Study of the Distributed Rough Set Based Locality Sensitive Hashing Feature Selection Technique
- Privacy-preserving computation in cyber-physical-social systems: a survey of the state-of-the-art and perspectives
- Translating Scala Programs to Isabelle/HOL
- A survey on the distributed computing stack
- Love and Hate During Political Campaigns in Social Networks
- Triclustering in Big Data Setting
- Fregel: a functional domain-specific language for vertex-centric large-scale graph processing
- A Novel Hybrid Sampling Algorithm for Solving Class Imbalance Problem in Big Data
- Parallel Weighted Random Sampling
- Scheduling Parallel-Task Jobs Subject to Packing and Placement Constraints
- Parametric Gaussian process regression for big data
- GSA for machine learning problems: a comprehensive overview
- Genetic programming \(+\) proof search \(=\) automatic improvement
- Traditional and context-specific spam detection in low resource settings
- A new accelerated proximal boosting machine with convergence rate \(O(1/t^2)\)
- A semi-parallel framework for greedy information-theoretic feature selection
- Distributed cooperative learning over time-varying random networks using a gossip-based communication protocol
- Elephant against Goliath: performance of big data versus high-performance computing DBSCAN clustering implementations
- An intuitive fuzzy approach for evaluating financial resiliency of supply chain
- Title not available
- MLlib: machine learning in Apache Spark
- An effective and efficient MapReduce algorithm for computing BFS-based traversals of large-scale RDF graphs
- From distributed coordination to field calculus and aggregate computing
- Equivalence classes and conditional hardness in massively parallel computations
- \(k\)-means, Ward and probabilistic distance-based clustering methods with contiguity constraint
- Spark solutions for discovering fuzzy association rules in big data
- Distribution Policies for Datalog
- Full likelihood inference from the site frequency spectrum based on the optimal tree resolution
- MLP-ANN-based execution time prediction model and assessment of input parameters through structural modeling
- KATZ centrality with biogeography-based optimization for influence maximization problem
- Novel data-driven method for non-probabilistic uncertainty analysis of engineering structures based on ellipsoid model
- A three-way cluster ensemble approach for large-scale data
- Scaling up Bayesian variational inference using distributed computing clusters
- GEODIS: towards the optimization of data locality-aware job scheduling in geo-distributed data centers
- Statistical challenges of big brain network data
- Computational fluid dynamics simulation based on hadoop ecosystem and heterogeneous computing
- MuLOT: multi-level optimization of the canonical polyadic tensor decomposition at large-scale
- A cloud computing-based intelligent forecasting method for cross-border e-commerce logistics costs
- Iterative selection of categorical variables for log data anomaly detection
- Widening: using parallel resources to improve model quality
- Semantic Foundations for Deterministic Dataflow and Stream Processing
- Large scale implementations for Twitter sentiment classification
- A distributed and incremental SVD algorithm for agglomerative data analysis on large networks
- Boosting evolutionary algorithm configuration
- Title not available
- Optimal control in dynamic food supply chains using big data
- User-Defined Tensor Data Analysis
- A novel interval-valued data driven type-2 possibilistic local information c-means clustering for land cover classification
- Big data time series forecasting based on pattern sequence similarity and its application to the electricity demand
- Using machine learning with PySpark and MLib for solving a binary classification problem: case of searching for exotic particles
- A distributed \(K\)-means segmentation algorithm applied to Lobesia botrana recognition
- Least-Square Approximation for a Distributed System
- Evidential instance selection for \(K\)-nearest neighbor classification of big data
- Distribution policies for Datalog
- Mining maximal frequent patterns in transactional databases and dynamic data streams: a Spark-based approach
- Randomized Gradient Boosting Machine
- A distributed ensemble of relevance vector machines for large-scale data sets on Spark
- Minimum distance histograms with universal performance guarantees
- Big data: from collection to visualization
- An experience in using machine learning for short-term predictions in smart transportation systems
- Link prediction in multiplex networks using intralayer probabilistic distance and interlayer co-evolving factors
- A greedy feature selection algorithm for big data of high dimensionality
- Title not available
- A safe reinforced feature screening strategy for Lasso based on feasible solutions
- A Bayesian perspective of statistical machine learning for big data