machine-learning – Make Me Engineer

TimeDistributed(Dense) vs Dense in Keras – Same number of parameters

June 14, 2023 by Tarik

TimeDistributedDense applies the same dense layer to every time step during GRU/LSTM cell unrolling, so the error function is computed between the predicted label sequence and the actual label sequence (which is normally the requirement for sequence-to-sequence labeling problems). However, with return_sequences=False, the Dense layer is applied only once, at the last cell. This is normally … Read more
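The weight sharing described above can be sketched with plain NumPy rather than Keras. This is only an illustration under assumed shapes: the array rnn_outputs is a random stand-in for the per-step GRU/LSTM outputs, and the sizes (batch, timesteps, hidden, n_labels) are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, timesteps, hidden, n_labels = 2, 5, 4, 3  # hypothetical sizes

# Stand-in for the sequence of hidden states an unrolled GRU/LSTM would emit.
rnn_outputs = rng.normal(size=(batch, timesteps, hidden))

# A single dense layer: one weight matrix and one bias vector.
W = rng.normal(size=(hidden, n_labels))
b = np.zeros(n_labels)

# "Time-distributed" application: the same W and b are used at every time step.
per_step = rnn_outputs @ W + b

# Equivalent explicit loop over time steps, stacked back along the time axis.
looped = np.stack(
    [rnn_outputs[:, t, :] @ W + b for t in range(timesteps)], axis=1
)

# With return_sequences=False only the last hidden state reaches the dense layer.
last_only = rnn_outputs[:, -1, :] @ W + b

# The parameter count does not depend on the number of time steps.
n_params = W.size + b.size

print(per_step.shape, last_only.shape, n_params)
```

Because the same W and b are reused at each step, the dense layer contributes the same number of parameters whether it is applied to every step or only to the last one; what changes is the shape of the prediction that enters the loss.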

Recommended package for very large dataset processing and machine learning in R [closed]

June 13, 2023 by Tarik

Can Keras deal with input images with different size?

June 12, 2023 by Tarik

Yes. Just change your input shape to shape=(n_channels, None, None), where n_channels is the number of channels in your input image. I’m using the Theano backend, though, so if you are using TensorFlow you might have to change it to (None, None, n_channels). You should use: input_shape=(1, None, None). None in a shape denotes a variable dimension. Note … Read more
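A minimal sketch of the idea, assuming TensorFlow's bundled Keras and the channels-last layout. The layer choices here (one Conv2D followed by global average pooling) are assumptions made for illustration and are not taken from the answer; the pooling step is what lets a Dense layer follow an input of unknown height and width.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Channels-last layout: height and width are left as None (variable).
inputs = keras.Input(shape=(None, None, 3))
x = layers.Conv2D(8, kernel_size=3, padding="same", activation="relu")(inputs)
# Global pooling collapses any spatial size into a fixed-length vector,
# so the Dense layer below always receives 8 features.
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(2)(x)
model = keras.Model(inputs, outputs)

# The same model accepts images of different sizes (one size per batch).
small_shape = tuple(model(np.zeros((1, 32, 32, 3), dtype="float32")).shape)
large_shape = tuple(model(np.zeros((1, 64, 48, 3), dtype="float32")).shape)
print(small_shape, large_shape)
```

Images within a single batch still have to share one size, since a batch is a single array; the variable dimensions only free the model from committing to one size for all batches.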

scikit-learn random state in splitting dataset

June 11, 2023 by Tarik

It doesn’t matter whether random_state is 0, 1, or any other integer. What matters is that it is set to the same value on every run if you want to reproduce and validate your processing across multiple runs of the code. By the way, I have seen random_state=42 used in many official scikit-learn examples as well as … Read more
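As a small illustration of the point above, the sketch below splits the same toy arrays twice with the same seed and checks that the held-out rows match. The data and the test_size value are made up for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Two independent calls with the same random_state.
a_train, a_test, ya_train, ya_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
b_train, b_test, yb_train, yb_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Identical seeds give identical partitions.
same_split = np.array_equal(a_test, b_test) and np.array_equal(ya_test, yb_test)
print(same_split)
```

The particular integer is arbitrary; the reproducibility comes from reusing it.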

Java-R integration?

June 11, 2023 by Tarik

Reset weights in Keras layer

June 11, 2023 by Tarik

Save the initial weights right after compiling the model but before training it: model.save_weights('model.h5'). Then, after training, “reset” the model by reloading the initial weights: model.load_weights('model.h5'). This gives you an apples-to-apples model for comparing different data sets and should be quicker than recompiling the entire model.
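The same reset idea can be sketched without touching the disk by swapping in get_weights and set_weights for the file-based calls above. This is a variation on the answer, not its exact method; the tiny model and random data are placeholders. One reason to consider the in-memory form is that newer Keras releases expect the filename passed to save_weights to end in .weights.h5.

```python
import numpy as np
from tensorflow import keras

# Placeholder model, compiled but not yet trained.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(3, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")

# Snapshot the initial weights as NumPy arrays.
initial_weights = [w.copy() for w in model.get_weights()]

# Train briefly on random placeholder data so the weights change.
x = np.random.rand(32, 4).astype("float32")
y = np.random.rand(32, 1).astype("float32")
model.fit(x, y, epochs=2, verbose=0)

# "Reset" the model by restoring the snapshot.
model.set_weights(initial_weights)

restored = all(
    np.array_equal(current, saved)
    for current, saved in zip(model.get_weights(), initial_weights)
)
print(restored)
```

Note that this restores only the layer weights; optimizer state accumulated during training is a separate matter and is not reset by either approach as sketched here.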

TfidfVectorizer in scikit-learn : ValueError: np.nan is an invalid document

June 10, 2023 by Tarik

You need to convert the object-dtype column to Unicode strings, as the traceback indicates: x = v.fit_transform(df['Review'].values.astype('U')) (astype(str) would work as well). From the documentation page of TfidfVectorizer: fit_transform(raw_documents, y=None). Parameters: raw_documents : iterable. An iterable which yields either str, unicode or file objects.
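A small sketch of the fix on a made-up three-row DataFrame follows. One side effect worth knowing: casting with astype('U') turns a missing value into the literal string "nan", which then becomes a vocabulary token. Replacing missing values with an empty string first, shown as the second variant, is an alternative suggested here and is not part of the original answer.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up data with one missing review, stored as an object column.
df = pd.DataFrame(
    {"Review": pd.Series(["good movie", np.nan, "bad movie"], dtype=object)}
)

# Variant 1 (from the answer): cast everything to Unicode strings.
v = TfidfVectorizer()
x = v.fit_transform(df["Review"].values.astype("U"))
vocab_cast = sorted(v.vocabulary_)  # the missing value shows up as "nan"

# Variant 2 (alternative): treat missing reviews as empty documents.
v_filled = TfidfVectorizer()
x_filled = v_filled.fit_transform(df["Review"].fillna(""))
vocab_filled = sorted(v_filled.vocabulary_)

print(vocab_cast)
print(vocab_filled)
```

Which variant is preferable depends on whether a spurious "nan" feature matters for the downstream model.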

Pattern recognition in time series [closed]

June 9, 2023 by Tarik

Here is a sample result from a small project I did to partition ECG data. My approach was a “switching autoregressive HMM” (google this if you haven’t heard of it), where each datapoint is predicted from the previous datapoint using a Bayesian regression model. I created 81 hidden states: a junk state to capture data … Read more

How to find the importance of the features for a logistic regression model?

June 9, 2023 by Tarik

One of the simplest options to get a feeling for the “influence” of a given parameter in a linear classification model (logistic regression being one of those) is to consider the magnitude of its coefficient times the standard deviation of the corresponding feature in the data. Consider this example: import numpy as np from sklearn.linear_model import … Read more
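The heuristic above (absolute coefficient times the feature's standard deviation) can be sketched as follows. This is not the truncated example from the post; the synthetic data, the true weights of 3.0 and 0.5, and the variable names are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500

# Three synthetic features on different scales (standard deviations 1, 1, 10).
X = np.column_stack([
    rng.normal(0, 1, n),
    rng.normal(0, 1, n),
    rng.normal(0, 10, n),
])

# The label depends strongly on feature 0, weakly on feature 1,
# and not at all on feature 2.
logits = 3.0 * X[:, 0] + 0.5 * X[:, 1]
y = (logits + rng.logistic(size=n) > 0).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Influence score: |coefficient| scaled by the spread of the feature.
influence = np.abs(model.coef_[0]) * X.std(axis=0)
ranking = np.argsort(influence)[::-1]  # most influential first

print(influence)
print(ranking)
```

Multiplying by the standard deviation puts the coefficients on a comparable footing: a coefficient that looks small only because its feature is measured in large units is scaled back up, and vice versa.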

TensorFlow, "'module' object has no attribute 'placeholder'"

June 7, 2023 by Tarik

If you get this error after upgrading to TensorFlow 2.0, you can still use the 1.x API by replacing import tensorflow as tf with these two lines: import tensorflow.compat.v1 as tf, followed by tf.disable_v2_behavior().