[1] |
BOTTOU L. Large-Scale Machine Learning with Stochastic Gradient Descent[C]// Springer. Proceedings of COMPSTAT'2010. Berlin: Springer, 2010: 177-186.
|
[2] |
DUCHI J, HAZAN E, SINGER Y. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization[J]. Journal of Machine Learning Research, 2011, 12(7): 257-269.
|
[3] |
LIN Jiadong, SONG Chuanbiao, HE Kun, et al. Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks[EB/OL]. (2020-02-03)[2022-10-05]. https://www.xueshufan.com/publication/2976752987.
|
[4] |
ZEILER M D. Adadelta: An Adaptive Learning Rate Method[EB/OL]. (2012-12-22)[2022-10-05]. http://export.arxiv.org/pdf/1212.5701.
|
[5] |
WANG Bao, NGUYEN T, SUN Tao, et al. Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent[EB/OL]. (2020-04-26)[2022-10-05]. https://www.xueshufan.com/publication/3007093918.
|
[6] |
ZOU Fangyu, SHEN Li, JIE Zequn, et al. A Sufficient Condition for Convergences of Adam and RMSprop[C]// IEEE. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2018: 11127-11135.
|
[7] |
KINGMA D P, BA J. Adam: A Method for Stochastic Optimization[EB/OL]. (2017-01-30)[2022-10-05]. https://arxiv.org/pdf/1412.6980.
|
[8] |
TAYLOR G, BURMEISTER R, XU Zheng, et al. Training Neural Networks without Gradients: A Scalable ADMM Approach[C]// ACM. International Conference on Machine Learning. New York: ACM, 2016: 2722-2731.
|
[9] |
WANG Junxiang, YU Fuxun, CHEN Xiang, et al. ADMM for Efficient Deep Learning with Global Convergence[C]// ACM. 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2019: 111-119.
|
[10] |
HONG Mingyi, LUO Zhiquan, RAZAVIYAYN M. Convergence Analysis of Alternating Direction Method of Multipliers for a Family of Nonconvex Problems[C]// IEEE. International Conference on Acoustics, Speech, and Signal Processing (ICASSP). New York: IEEE, 2015: 337-364.
|
[11] |
GOLDSTEIN T, O'DONOGHUE B, SETZER S, et al. Fast Alternating Direction Optimization Methods[J]. SIAM Journal on Imaging Sciences, 2014, 7(3): 1588-1623.
doi: 10.1137/120896219
|
[12] |
ZHANG Guoqiang, HEUSDENS R. Bi-Alternating Direction Method of Multipliers over Graphs[C]// IEEE. 2015 IEEE International Conference on Acoustics, Speech and Signal Processing. New York: IEEE, 2013: 3317-3321.
|
[13] |
ZHANG Guoqiang, HEUSDENS R, KLEIJN W B. On the Convergence Rate of the Bi-Alternating Direction Method of Multipliers[C]// IEEE. 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). New York: IEEE, 2014: 3869-3873.
|
[14] |
BECK A, TEBOULLE M. A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems[J]. SIAM Journal on Imaging Sciences, 2009, 2(1): 183-202.
doi: 10.1137/080716542
|
[15] |
SCHEINBERG K, GOLDFARB D, BAI Xi. Fast First-Order Methods for Composite Convex Optimization with Backtracking[J]. Foundations of Computational Mathematics, 2014, 14(3): 389-417.
doi: 10.1007/s10208-014-9189-9
|
[16] |
WANG Huahua, BANERJEE A. Bregman Alternating Direction Method of Multipliers[EB/OL]. (2014-07-08)[2022-10-05]. https://doi.org/10.48550/arXiv.1306.3203.
doi: 10.48550/arXiv.1306.3203
|
[17] |
ZHOU Xingyu. On the Fenchel Duality Between Strong Convexity and Lipschitz Continuous Gradient[EB/OL]. (2018-03-17)[2022-10-05]. https://www.xueshufan.com/publication/2793948820.
|
[18] |
HUTZENTHALER M, JENTZEN A, KRUSE T, et al. Multilevel Picard Approximations for High-Dimensional Semilinear Second-Order PDEs with Lipschitz Nonlinearities[EB/OL]. (2018-03-17)[2022-10-05]. https://arxiv.org/abs/2009.02484.
|
[19] |
BOYD S, PARIKH N, CHU E, et al. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers[J]. Foundations and Trends® in Machine Learning, 2011, 3(1): 1-122.
doi: 10.1561/2200000016
|
[20] |
DOMBI J, JONAS T. The Generalized Sigmoid Function and its Connection with Logical Operators[J]. International Journal of Approximate Reasoning, 2022, 143(4): 121-138.
doi: 10.1016/j.ijar.2022.01.006
|
[21] |
NAYEF B H, ABDULLAH S N H S, SULAIMAN R, et al. Optimized Leaky ReLU for Handwritten Arabic Character Recognition Using Convolution Neural Networks[J]. Multimedia Tools and Applications, 2022, 81(2): 2065-2094.
doi: 10.1007/s11042-021-11593-6
|
[22] |
LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-Based Learning Applied to Document Recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
doi: 10.1109/5.726791
|
[23] |
XIAO Han, RASUL K, VOLLGRAF R. Fashion-MNIST: A Novel Image Dataset for Benchmarking Machine Learning Algorithms[EB/OL]. (2017-09-15)[2022-10-05]. https://arxiv.org/pdf/1708.07747.
|
[24] |
GOLDSBOROUGH P. A Tour of Tensorflow[EB/OL]. (2016-10-01)[2022-10-05]. https://arxiv.org/pdf/1610.01178.
|
[25] |
PASZKE A, GROSS S, MASSA F, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library[J]. Advances in Neural Information Processing Systems, 2019, 32: 1-12.
|