I'm building a prediction model that takes user input for different movie attributes and returns the gross revenue. I have two questions:
1. I am getting a ValueError about reshaping, and I'm not sure where the .reshape(-1, 1) goes.
2. Where, if anywhere, do I specify the Y variable (gross revenue)? Or does the code below produce it automatically?
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

notOver = True
while notOver:
    user_movie = input("Enter the name of your movie: ")
    user_genre = input("What is the genre of your movie? ")
    user_budget = input("What is your movie's budget? ")
    user_runtime = input("What is your movie's runtime? ")
    user_studio = input("Who is your studio partner? ")
    user_director = input("Who is the director of your movie? ")
    user_writer = input("Who is your movie's writer? ")
    user_ohe = ["user_genre", "user_studio", "user_director", "user_writer"]
    transformer_ohe = OneHotEncoder(handle_unknown="ignore")
    user_num = ["user_budget", "user_runtime"]
    preprocessor = ColumnTransformer(transformers=[("cat", transformer_ohe, user_ohe)])
    steps = Pipeline(steps=[('preprocessor', preprocessor), ('classifier', Ridge())])
    steps.fit(user_ohe, user_num)
    notOver = False
Error:
ValueError: Expected 2D array, got 1D array instead:
array=['user_genre' 'user_studio' 'user_director' 'user_writer'].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
I'm new to this so your help is appreciated! Thank you :)
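scikit-learn expects the features X as a 2D array (one row per sample, one column per feature) and the labels y as a 1D array. Your user_ohe is a one-dimensional Python list, which is why fit raises the error. A list has no .reshape method, so either convert it first with np.array(user_ohe).reshape(1, -1) (you have a single sample) or enclose it in square brackets: [user_ohe]. Note, however, that user_ohe holds the strings "user_genre", "user_studio", and so on, not the values the user typed, and that ColumnTransformer can only select columns by name from a pandas DataFrame. As for Y: it is not produced automatically. It is the second argument to fit and must be a 1D array of known gross revenues, one per training row. As far as I understand, you are passing user_num there, but budget and runtime are features, so they belong in X.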
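To make the shapes concrete, here is a minimal sketch of one way this could be laid out: fit the pipeline once on historical movies, then build a one-row DataFrame from the user's answers and call predict. The training table, its column names, and every value in it are made up for illustration, the step name "regressor" is my own choice, and the user's answers are hard-coded instead of read with input() so the example runs on its own.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training data: one row per past movie (all values made up).
train = pd.DataFrame({
    "genre": ["Action", "Comedy", "Drama", "Action"],
    "studio": ["A", "B", "A", "C"],
    "director": ["D1", "D2", "D3", "D1"],
    "writer": ["W1", "W2", "W1", "W3"],
    "budget": [100e6, 20e6, 15e6, 150e6],
    "runtime": [120, 95, 110, 130],
    "gross": [300e6, 45e6, 30e6, 500e6],
})

cat_cols = ["genre", "studio", "director", "writer"]
num_cols = ["budget", "runtime"]

# One-hot encode the categorical columns, pass the numeric ones through as-is.
preprocessor = ColumnTransformer(transformers=[
    ("cat", OneHotEncoder(handle_unknown="ignore"), cat_cols),
    ("num", "passthrough", num_cols),
])
model = Pipeline(steps=[("preprocessor", preprocessor), ("regressor", Ridge())])

X = train[cat_cols + num_cols]  # 2D: rows are movies, columns are features
y = train["gross"]              # 1D: the known gross revenue for each row
model.fit(X, y)

# The user's answers become ONE row with the same column names as X.
# With input(), budget and runtime would need float(...) conversion first.
user_row = pd.DataFrame([{
    "genre": "Action", "studio": "B", "director": "D9", "writer": "W1",
    "budget": 80e6, "runtime": 115,
}])
predicted_gross = model.predict(user_row)  # array with a single prediction
print(predicted_gross[0])
```

Because of handle_unknown="ignore", a director or studio that never appeared in the training data is encoded as all zeros instead of raising an error. With only four made-up rows the predicted number itself is meaningless; the point is where X, y, and the user's input go.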