TimeDistributed(Dense) vs Dense in Keras – Same number of parameters

TimeDistributed(Dense) applies the same Dense layer to every time step during GRU/LSTM cell unrolling, so the error function compares the predicted label sequence against the actual label sequence (which is normally the requirement for sequence-to-sequence labeling problems). With return_sequences=False, however, the Dense layer is applied only once, at the last cell. This is normally … Read more
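The equal parameter count follows directly from the weight sharing: TimeDistributed reuses one weight matrix across all time steps. A small arithmetic sketch in plain Python (not Keras itself; the layer sizes are made-up examples):

```python
# A Dense(units) layer's parameter count depends only on the input
# feature size, never on how many time steps it is applied to.

def dense_params(input_dim, units):
    """Weight matrix (input_dim * units) plus one bias per unit."""
    return input_dim * units + units

features, units = 64, 10  # made-up layer sizes for illustration

# Dense after return_sequences=False: applied once to the last output.
plain = dense_params(features, units)

# TimeDistributed(Dense): the SAME matrix is reused at every time step,
# so no extra parameters appear no matter how long the sequence is.
time_distributed = dense_params(features, units)

print(plain, time_distributed)  # 650 650
```

Either way the count is 64 × 10 + 10 = 650; only the loss (sequence vs. single label) differs.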

Can Keras deal with input images with different size?

Yes. Just change your input shape to shape=(n_channels, None, None), where n_channels is the number of channels in your input image. I’m using the Theano backend, though, so if you are using TensorFlow you might have to change it to (None, None, n_channels). You should use: input_shape=(1, None, None). None in a shape denotes a variable dimension. Note … Read more
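Variable spatial dimensions work because convolutional layers store weights only for the kernel, not for the input extent, so the same filter slides over any image size. A minimal NumPy sketch of that idea (a naive "valid" cross-correlation; the sizes are arbitrary examples, not from the original):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 'valid' 2D cross-correlation: the kernel slides over the
    image, so any image at least as large as the kernel is accepted."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

kernel = np.ones((3, 3))  # one fixed set of weights, independent of input size

small = conv2d_valid(np.random.rand(8, 8), kernel)
large = conv2d_valid(np.random.rand(20, 32), kernel)
print(small.shape, large.shape)  # (6, 6) (18, 30)
```

The same 3×3 weights handle both inputs, which is exactly why a `None` spatial dimension is legal for fully convolutional models.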

Pattern recognition in time series [closed]

Here is a sample result from a small project I did to partition ECG data. My approach was a “switching autoregressive HMM” (google this if you haven’t heard of it), where each datapoint is predicted from the previous datapoint using a Bayesian regression model. I created 81 hidden states: a junk state to capture data … Read more

Should Feature Selection be done before Train-Test Split or after?

It is not actually difficult to demonstrate why using the whole dataset (i.e. before splitting into train/test sets) for selecting features can lead you astray. Here is one such demonstration using random dummy data with Python and scikit-learn:

import numpy as np
from sklearn.feature_selection import SelectKBest
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics … Read more
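A self-contained sketch of the demonstration this answer sets up, assuming random noise features and labels (my own choice of seeds and sizes; the exact accuracies depend on the seed and scikit-learn version):

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.RandomState(0)
X = rng.randn(100, 1000)    # pure noise features
y = rng.randint(0, 2, 100)  # labels independent of X

# Wrong: select features on the FULL dataset, then split.
# The selector has already peeked at the test labels.
X_sel = SelectKBest(f_classif, k=10).fit_transform(X, y)
Xtr, Xte, ytr, yte = train_test_split(X_sel, y, test_size=0.3, random_state=1)
acc_leaky = accuracy_score(yte, LogisticRegression().fit(Xtr, ytr).predict(Xte))

# Right: split first, select features on the training set only.
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=1)
sel = SelectKBest(f_classif, k=10).fit(Xtr, ytr)
clf = LogisticRegression().fit(sel.transform(Xtr), ytr)
acc_clean = accuracy_score(yte, clf.predict(sel.transform(Xte)))

print(acc_leaky, acc_clean)
```

On typical runs the leaky pipeline scores well above 50% even though the labels are pure noise, while the clean pipeline stays near chance.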

How would one use Kernel Density Estimation as a 1D clustering method in scikit learn?

Write the code yourself. Then it fits your problem best! Boilerplate: never assume code you download from the net is correct or optimal; make sure to fully understand it before using it.

%matplotlib inline
from numpy import array, linspace
from sklearn.neighbors import KernelDensity
from matplotlib.pyplot import plot

a = array([10, 11, 9, 23, 21, 11, 45, 20, 11, 12]).reshape(-1, 1)
kde = KernelDensity(kernel="gaussian", bandwidth=3).fit(a) … Read more
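One common way to finish this approach: evaluate the fitted density on a grid and cut clusters at its local minima. A sketch continuing from the same data (the grid range and the valley-splitting logic are my own choices, not from the original):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

a = np.array([10, 11, 9, 23, 21, 11, 45, 20, 11, 12]).reshape(-1, 1)
kde = KernelDensity(kernel="gaussian", bandwidth=3).fit(a)

# Evaluate the density on a grid and find its interior local minima,
# which serve as split points between 1D clusters.
grid = np.linspace(0, 50, 501).reshape(-1, 1)
dens = np.exp(kde.score_samples(grid))
is_min = (dens[1:-1] < dens[:-2]) & (dens[1:-1] < dens[2:])
splits = grid[1:-1][is_min, 0]  # density valleys between clusters

# Assign each point the index of the interval it falls into.
labels = np.searchsorted(splits, a[:, 0])
print(splits, labels)
```

With bandwidth 3 this data yields two valleys (roughly between 12 and 20, and between 23 and 45), i.e. three clusters: the points near 10, the points near 21, and the lone 45.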

Keras Text Preprocessing – Saving Tokenizer object to file for scoring

The most common way is to use either pickle or joblib. Here is an example of how to use pickle to save a Tokenizer:

import pickle

# saving
with open('tokenizer.pickle', 'wb') as handle:
    pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)

# loading
with open('tokenizer.pickle', 'rb') as handle:
    tokenizer = pickle.load(handle)
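The answer mentions joblib as the alternative but only shows pickle; here is a sketch of the same round trip with joblib, using a plain dict as a stand-in for the Tokenizer (any picklable object works the same way):

```python
import os
import tempfile
import joblib

# Stand-in for a real Keras Tokenizer; joblib serializes any picklable object.
tokenizer = {"word_index": {"hello": 1, "world": 2}}

path = os.path.join(tempfile.gettempdir(), "tokenizer.joblib")
joblib.dump(tokenizer, path)   # saving
restored = joblib.load(path)   # loading

print(restored == tokenizer)  # True
```

joblib's interface is simpler (no explicit file handle or protocol argument) and it is already installed wherever scikit-learn is.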

Why does one hot encoding improve machine learning performance? [closed]

Many learning algorithms either learn a single weight per feature or use distances between samples. The former is the case for linear models such as logistic regression, which are easy to explain. Suppose you have a dataset with only a single categorical feature “nationality”, with values “UK”, “French” and “US”. Assume, without loss of … Read more
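The distance half of the argument can be made concrete with a small numeric sketch (my own illustrative encodings, not from the original): an ordinal code imposes an arbitrary geometry on the categories, while one-hot codes make all distinct pairs equidistant and give a linear model one weight per category.

```python
import numpy as np

nationalities = ["UK", "French", "US"]

# Ordinal encoding imposes an arbitrary order (UK=0, French=1, US=2),
# so "UK" and "US" look twice as far apart as "UK" and "French".
ordinal = {nat: float(i) for i, nat in enumerate(nationalities)}
d_uk_fr = abs(ordinal["UK"] - ordinal["French"])
d_uk_us = abs(ordinal["UK"] - ordinal["US"])

# One-hot encoding: every pair of distinct categories is equidistant,
# and a linear model learns one weight per category, not one overall.
onehot = {nat: np.eye(3)[i] for i, nat in enumerate(nationalities)}
d2_uk_fr = np.linalg.norm(onehot["UK"] - onehot["French"])
d2_uk_us = np.linalg.norm(onehot["UK"] - onehot["US"])

print(d_uk_fr, d_uk_us)    # 1.0 2.0
print(d2_uk_fr, d2_uk_us)  # both sqrt(2)
```

Distance-based methods (k-NN, k-means, kernels) therefore stop treating the arbitrary integer order as meaningful once the feature is one-hot encoded.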