<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>learning Archives - AI Betting Edge</title>
	<atom:link href="https://aibettingedge.com/tag/learning/feed/" rel="self" type="application/rss+xml" />
	<link>https://aibettingedge.com/tag/learning/</link>
	<description>MLB and NBA Prop Bets</description>
	<lastBuildDate>Sun, 21 Aug 2022 10:58:40 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
	<item>
		<title>Calculating Season Averages for NBA Betting Features</title>
		<link>https://aibettingedge.com/calculating-season-averages-for-nba-betting-features/</link>
					<comments>https://aibettingedge.com/calculating-season-averages-for-nba-betting-features/#respond</comments>
		
		<dc:creator><![CDATA[Shon Butani]]></dc:creator>
		<pubDate>Fri, 15 May 2020 14:04:47 +0000</pubDate>
				<category><![CDATA[Learning]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Trends]]></category>
		<category><![CDATA[betting]]></category>
		<category><![CDATA[learning]]></category>
		<category><![CDATA[nba]]></category>
		<category><![CDATA[python]]></category>
		<guid isPermaLink="false">https://demo.creativethemes.com/blocksy/app/?p=270</guid>

					<description><![CDATA[<p>In this post, we&#8217;re going to dig into some feature engineering. At its core, any NBA betting model is a time series analysis problem. You&#8217;re trying to predict what a player will do tonight, given what he&#8217;s done in the past. The &#8220;given what he&#8217;s done in the past&#8221; is what we call feature engineering. [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://aibettingedge.com/calculating-season-averages-for-nba-betting-features/">Calculating Season Averages for NBA Betting Features</a> appeared first on <a rel="nofollow" href="https://aibettingedge.com">AI Betting Edge</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>In this post, we&#8217;re going to dig into some feature engineering. At its core, any NBA betting model is a time series analysis problem. You&#8217;re trying to predict what a player will do tonight, given what he&#8217;s done in the past. Turning &#8220;what he&#8217;s done in the past&#8221; into model inputs is what we call feature engineering. We&#8217;ll walk you through how to take some basic stats and turn them into season average features, while preventing the dreaded data leakage problem.</p>



<p>First, as always (most of the time), we import pandas. That&#8217;s surprisingly the only package we&#8217;ll need for this blog post.</p>



<pre class="wp-block-code language-python"><code>import pandas as pd</code></pre>



<p>Next, let&#8217;s define the features we want to compute season averages for. For simplicity, we chose the basic box score stats.</p>



<pre class="wp-block-code"><code>stats = &#91;'PTS','REB','AST','TOV','3PM','STL','BLK']</code></pre>



<p>Now let&#8217;s dig into the logic. A few relatively straightforward preprocessing steps need to be performed before computing the season average. But before digging into the code, let&#8217;s talk a little about what we DON&#8217;T want to do. Imagine the NBA season is 5 games long: Monday, Tuesday, Wednesday, Thursday and Friday. LeBron James scored 20, 25, 30, 35 and 40 points on those 5 days. We pass in all 5 games as part of our training data (the data the algorithm learns from), and the only feature we use is his season average for points.&nbsp;<em>What is</em>&nbsp;his season average? You may quickly think, oh it&#8217;s just (20+25+30+35+40)/5 = 30 points. But is his season average really 30 points on all 5 of those game dates? To the algorithm, when it&#8217;s predicting Wednesday, the games on Thursday and Friday have not happened yet. So for Monday, there is no average, because there&#8217;s no &#8220;yesterday&#8221;. For Tuesday, his average is 20 (Monday&#8217;s total). For Wednesday, it&#8217;s 22.5 (the average of Monday&#8217;s and Tuesday&#8217;s totals), and so on: 25 for Thursday and 27.5 for Friday.</p>
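
<p>To make the difference concrete, here is a minimal sketch of the five-game example above using pandas. The series name <code>points</code> and the day labels are assumptions made for illustration, not part of any real dataset.</p>

<pre class="wp-block-code language-python"><code>import pandas as pd

# Hypothetical five-game season from the example above
points = pd.Series(
    [20, 25, 30, 35, 40],
    index=["Mon", "Tue", "Wed", "Thu", "Fri"],
    name="PTS",
)

# Naive average: uses all five games, so every row "sees" the future
leaky_average = points.mean()

# Leakage-free average: running mean of earlier games only
lagged_average = points.expanding().mean().shift(1)

print(leaky_average)
print(lagged_average.tolist())</code></pre>

<p>The naive mean is 30 for every row, while the lagged version gives no value for Monday, then 20, 22.5, 25 and 27.5, which are the averages the model is actually allowed to see on each day.</p>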



<p>With that being said, let&#8217;s dig into the code. We first need to convert the &#8220;GAME_DATE&#8221; column to a pandas datetime. Next, we need to sort on three columns: first by &#8220;SEASON_ID&#8221;, second by &#8220;PLAYER_ID&#8221;, and third by &#8220;GAME_DATE&#8221;. The logic behind this is straightforward: we want each player&#8217;s per-season stats, sorted in ascending order by game date (to prevent data leakage as mentioned above).</p>



<p>After the sorting is done, we do a groupby on &#8220;SEASON_ID&#8221;, &#8220;PLAYER_ID&#8221;, and &#8220;TEAM_ID&#8221; to ensure that if a player gets traded midseason, we&#8217;re counting his stats only with the team he&#8217;s on as of the game date in the row. We apply a lambda function to the stats columns we defined above that computes an expanding (running) mean, and shift everything by one row (that is, one game) to make sure that we&#8217;re not accidentally counting the player&#8217;s stats from that day towards his averages for predicting that day (again a data leakage problem). For each unique SEASON_ID, PLAYER_ID, TEAM_ID group, the first row should be null. That&#8217;s because of the shift(1): before a player&#8217;s first game with a team, there are no earlier games to average.</p>



<pre class="wp-block-code"><code># full_df is a dataframe with the player stats on each game_date

full_df&#91;"GAME_DATE"] = pd.to_datetime(full_df&#91;"GAME_DATE"])
# Sort chronologically within each player-season, then renumber the rows
full_df.sort_values(by=&#91;"SEASON_ID", "PLAYER_ID", "GAME_DATE"], inplace=True)
full_df.reset_index(inplace=True, drop=True)

# Expanding mean of each stat per season/player/team, shifted one game so a row
# only sees earlier games. transform keeps full_df's row index, so the join
# below lines up row by row even if a player changes teams midseason.
moving_average_dataframe = full_df.groupby(&#91;"SEASON_ID", "PLAYER_ID", "TEAM_ID"])&#91;stats].transform(lambda x: x.expanding().mean().shift(1))
moving_average_dataframe.columns = "1_day_lag_season_average" + moving_average_dataframe.columns
full_df = full_df.join(moving_average_dataframe)</code></pre>



<figure class="wp-block-image size-full"><img fetchpriority="high" decoding="async" width="752" height="419" src="https://aibettingedge.com/wp-content/uploads/2022/05/nba-season-average.png" alt="nba-season-average" class="wp-image-438" srcset="https://aibettingedge.com/wp-content/uploads/2022/05/nba-season-average.png 752w, https://aibettingedge.com/wp-content/uploads/2022/05/nba-season-average-300x167.png 300w" sizes="(max-width: 752px) 100vw, 752px" /></figure>
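
<p>As a quick sanity check, the sketch below runs the same steps on a tiny made-up game log in which one player is traded midseason. The data, the <code>toy_df</code> name and the <code>add_lagged_season_averages</code> helper are hypothetical, and it uses <code>transform</code> so that the averages keep the row index of the sorted dataframe. Treat it as one possible way to verify the logic rather than a definitive implementation.</p>

<pre class="wp-block-code language-python"><code>import pandas as pd

stats = ["PTS", "AST"]

# Made-up game log: player 1 is traded from team 200 to team 100 midseason
toy_df = pd.DataFrame({
    "SEASON_ID": ["2019", "2019", "2019", "2019", "2019"],
    "PLAYER_ID": [1, 1, 1, 1, 2],
    "TEAM_ID": [200, 200, 100, 100, 300],
    "GAME_DATE": ["2019-11-03", "2019-11-01", "2019-11-05", "2019-11-07", "2019-11-01"],
    "PTS": [30, 20, 10, 14, 8],
    "AST": [4, 6, 2, 4, 1],
})


def add_lagged_season_averages(df, stat_columns):
    """Return a sorted copy of df with lagged season averages joined on (hypothetical helper)."""
    out = df.copy()
    out["GAME_DATE"] = pd.to_datetime(out["GAME_DATE"])
    out = out.sort_values(by=["SEASON_ID", "PLAYER_ID", "GAME_DATE"]).reset_index(drop=True)
    # transform returns rows in the same order as `out`, so the join lines up
    lagged = out.groupby(["SEASON_ID", "PLAYER_ID", "TEAM_ID"])[stat_columns].transform(
        lambda x: x.expanding().mean().shift(1)
    )
    lagged.columns = "1_day_lag_season_average" + lagged.columns
    return out.join(lagged)


result = add_lagged_season_averages(toy_df, stats)
print(result[["PLAYER_ID", "TEAM_ID", "GAME_DATE", "1_day_lag_season_averagePTS"]])</code></pre>

<p>In the output, the first game with each team has no average, and the traded player&#8217;s average starts over with his new team: his second game with each team sees only the points from his first game with that team (20, then 10).</p>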
]]></content:encoded>
					
					<wfw:commentRss>https://aibettingedge.com/calculating-season-averages-for-nba-betting-features/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
