One-hot encoding with model.matrix. Is the intercept required?
I understand what one-hot encoding does in converting a factor with k levels to k-1 dummy variables, but what I'm confused about is whether the intercept is required to be specified or can be left out. For example, this removes the intercept:
# Predictor variables of train dataset
x <- model.matrix(y ~ ., train_data)[,-1]
But the model output seems the same regardless of whether I remove it or not.
r r-caret
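A minimal sketch of what the `[,-1]` indexing does, using a made-up `train_data` (the toy values and the column names `g` and `z` are assumptions for illustration, not from the question). It only drops the "(Intercept)" column from the matrix. Whether the fitted model has an intercept then depends on the fitting function: `lm()` with a formula adds one itself, and glmnet fits its own intercept by default, which is probably why the output looks the same.

```r
# Hypothetical toy data standing in for the question's train_data.
train_data <- data.frame(
  y = c(1.2, 2.3, 3.1, 4.8, 5.0, 6.4),
  g = factor(c("a", "b", "c", "a", "b", "c")),
  z = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5)
)

full <- model.matrix(y ~ ., train_data)        # keeps the "(Intercept)" column
x    <- model.matrix(y ~ ., train_data)[, -1]  # same matrix without column 1

colnames(full)  # "(Intercept)" "gb" "gc" "z"
colnames(x)     # "gb" "gc" "z"

# lm() adds its own intercept to x, so both fits describe the same model.
fit_formula <- lm(y ~ ., data = train_data)
fit_matrix  <- lm(train_data$y ~ x)
same_fit <- all.equal(unname(fitted(fit_formula)), unname(fitted(fit_matrix)))
print(same_fit)
```

Note that the factor still contributes only k-1 dummy columns here; dropping column 1 does not add the missing level back.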
I'm not sure exactly what you're trying to do, but you can leave the intercept out like
model.matrix(~ a + 0, data=data.frame(a=factor(1:3)))
which will give you a slightly different result.
– thelatemail
Nov 16 '18 at 3:24
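To make the "slightly different result" concrete, here is a small sketch comparing the two calls on the same three-level factor (the object names `with_int` and `no_int` are just illustrative):

```r
d <- data.frame(a = factor(1:3))

with_int <- model.matrix(~ a, data = d)      # intercept + k-1 dummies
no_int   <- model.matrix(~ a + 0, data = d)  # no intercept, all k dummies

colnames(with_int)  # "(Intercept)" "a2" "a3"
colnames(no_int)    # "a1" "a2" "a3"
```

With `+ 0` the first level gets its own indicator column instead of being absorbed into the intercept. With several factors in the formula, only the first one is expanded to all k columns; the others keep k-1.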
This might be better asked over at Cross Validated since you're really just reparametrizing the model. A linear combination of the dummy variables would act as the new "intercept" (provided the ranks of the model matrices are the same).
– mickey
Nov 16 '18 at 3:28
Some online examples I'd seen had removed the intercept with [,-1], but then I thought I had read somewhere you should only do that if you retain as many dummies as there are levels of the factor. I just wasn't sure. I've posted over at Cross Validated.
– LucaS
Nov 16 '18 at 3:41
The main idea is that you don't want your model.matrix to be singular. So it's either the intercept + k-1 dummies, or no intercept and all k dummies. It can be shown that the result should be the same, just with slight differences in parameter interpretation.
– Nutle
Nov 16 '18 at 14:38
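A rough sketch of that equivalence with `lm()` on made-up data (the values and names below are assumptions for illustration). Both parametrizations give the same fitted values, while stacking an intercept on top of all k dummies makes the matrix rank-deficient:

```r
# Hypothetical data: one factor with k = 3 levels, two observations each.
d <- data.frame(
  y = c(1, 2, 4, 5, 9, 10),
  g = factor(c("a", "a", "b", "b", "c", "c"))
)

fit_int   <- lm(y ~ g, data = d)      # intercept + k-1 dummies
fit_noint <- lm(y ~ g + 0, data = d)  # no intercept, all k dummies

# Same fitted values, different coefficient meaning.
same_fitted <- all.equal(fitted(fit_int), fitted(fit_noint))
coef(fit_int)    # mean of level "a", then differences from "a"
coef(fit_noint)  # one mean per level

# Intercept plus all k dummies: 4 columns but only rank 3.
X_over <- cbind(1, model.matrix(~ g + 0, data = d))
over_rank <- qr(X_over)$rank
print(c(columns = ncol(X_over), rank = over_rank))
```

The k dummy columns sum to a column of ones, which is why they can stand in for the intercept and why adding an explicit intercept on top of them is redundant.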
asked Nov 16 '18 at 2:26 by LucaS