
Don't Get Lost in Techniques – Focus on Optimizing Your Solutions

No matter how much we know, what counts is whether we can deliver a working artificial intelligence solution. Implementing a machine learning (ML) or deep learning (DL) program remains difficult and will become more complex as technology progresses at an exponential rate. This will be shown in Chapter 15, Cognitive NLP Chatbots, on quantum computing, which may revolutionize computing with its mind-blowing concepts. There is no simple or easy way to design AI systems. What matters is whether a system is efficient, not whether it was easy to build. Either the designed AI solution proves useful in real-life practice, or it ends up as a program that fails as soon as it runs in environments beyond the scope of its training sets.

This chapter is not about building the most difficult system possible to show off our knowledge and experience. It faces the hard truth of real-life delivery and describes ways to overcome obstacles. Without data, your project will never take off. Even an unsupervised ML program requires data, unlabeled, in some form or other.

Understanding how convolutional neural networks (CNNs) recognize the handwritten digit images of the Modified National Institute of Standards and Technology (MNIST) training sets, as described in Chapter 10, Applying Biomimicking to Artificial Intelligence, is not enough: efficiency comes first. In a corporate environment, you will often have to design datasets and obtain the data yourself. Doubts follow quickly. Are the datasets for a given project reliable? How can their reliability be measured? How many features need to be implemented? What about optimizing the cost function and training functions?

This chapter provides the methodology and tools needed to overcome everyday artificial intelligence project obstacles. k-means clustering, a key ML algorithm, will be applied to the growing need for warehouse intelligence (at Amazon and other online retailers). Quickly showing the solution to a problem will keep a project alive. Focusing on the solution, and not the techniques, will get you there.

The following topics will be covered in this chapter:

Designing datasets
The design matrix
Dimensionality reduction
Determining the volume of a training set
k-means clustering
Unsupervised learning
Data conditioning management for the training dataset
Lloyd's algorithm
Building a Python k-means clustering program from scratch
Hyperparameters
Test dataset and prediction
Presenting the solution to a team
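
As a preview of the k-means and Lloyd's algorithm topics listed above, here is a minimal sketch of the assign-then-update loop in Python with NumPy. It is not the program built later in the chapter: the function name lloyd_kmeans, its parameters, and the tiny two-feature dataset are assumptions made up for illustration.

```python
import numpy as np


def lloyd_kmeans(points, k, n_iter=100, seed=0):
    """Cluster points (n_samples x n_features) into k groups (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Initialize the centroids with k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: label each point with the index of its nearest centroid.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: move each centroid to the mean of the points assigned to it.
        # An empty cluster keeps its previous centroid.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical data: two obvious groups of two-feature observations.
points = np.array([
    [1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [2.0, 2.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0],
])
centroids, labels = lloyd_kmeans(points, k=2)
print(centroids)
print(labels)
```

On data this cleanly separated, the loop settles after a few iterations with one centroid in the middle of each group. Real project data is rarely this tidy, which is why dataset design and conditioning come before the algorithm in the topic list.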