Journal of Siberian Federal University. Engineering & Technologies, 2018, 11(6), 694-710
UDC 621.391:004.8
A Game Based Energy Sensitive
Spectrum Auction Model and Bid Learning Process
for Cognitive Radio Systems
Abdulkarim Ayopo Oloyede*
Department of Telecommunication Science
University of Ilorin Ilorin, Nigeria
Received 25.07.2017, received in revised form 30.08.2017, accepted 06.01.2018
An auction based bid learning process for cognitive radio networks, where the users and the service providers learn about each other to maximise each other's utility, is examined. A game model is formulated to allow players to learn depending on their priority. This enables users to learn different parameters such as the best offered bid price and the appropriate time to participate in the auction process. The performance of the system is examined based on the developed utility function. The results show that the blocking probability, the utility function and the energy consumed are better for the learning users than for the non-learning users. Results also show that, provided learning is taking place in the system, a Nash Equilibrium can be established.
Keywords: spectrum auction, dynamic spectrum access, learning based auction, utility function.
Citation: Oloyede A.A. A game based energy sensitive spectrum auction model and bid learning process for cognitive radio systems, J. Sib. Fed. Univ. Eng. technol., 2018, 11(6), 694-710. DOI: 10.17516/1999-494X-0086.
© Siberian Federal University. All rights reserved
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). Corresponding author E-mail address: [email protected]
A Game-Based Intuitive Spectrum Auction Model and a Study of the Bidding Process for Cognitive Radio Systems
Abdulkarim Oloyede
Department of Telecommunication Science
University of Ilorin, Ilorin, Nigeria
An auction based learning process for cognitive radio networks is studied, in which users and service providers learn about each other in order to maximise each other's utility. The game model is formulated to allow players to learn depending on their priority. This enables users to learn different parameters, such as the best offered bid price and the appropriate time to participate in the auction process. The performance of the system is examined based on the developed utility function. The results show that the blocking probability, the utility function and the energy consumed are better for learning users than for non-learning users. The results also show that, provided learning takes place in the system, a Nash Equilibrium can be established.
Keywords: spectrum auction, dynamic spectrum access, learning based auction, utility function.
I. Introduction
The huge shift to wireless communications brought about by the advent of smartphones and related devices is leading to congestion of the radio spectrum. The congestion is, however, mainly associated with the traditional fixed spectrum allocation schemes put in place by the different regulatory authorities [1, 2]. This led to the concept of Dynamic Spectrum Access (DSA) as proposed in [3]. Furthermore, energy efficiency is a key factor in future wireless networks because of climate change [4, 5]. In addition, the concept of Cognitive Radio Networks (CRN) has been proposed in [6]. Dynamic pricing using the concept of an auction was also introduced to complement the dynamic network, to increase revenue in line with the growing demand so that expansion can be funded, and to manage the occasional congestion that arises when people congregate in a single location, such as during a football match, the Olympics or other events. An auction process is important because, over the years, the price paid for the spectrum has been based on a potential price rather than allowing competition to reflect the actual price of the radio spectrum. This resulted in a growth in demand for the radio spectrum without a corresponding growth in revenue [7].
The implementation of a heterogeneous network requires proper planning in terms of pricing, licensing period and the power allocation mechanism, among others, to deliver the expected gain. However, the primary users of the radio spectrum are still not willing to share the radio spectrum based on the concept of DSA. This is because of concerns about interference from secondary users. Therefore, to encourage the efficient use of the radio spectrum for secondary access, the use of green payments (GP) as an incentive was previously proposed in [8], and an auction that balances revenue and fairness was proposed in [9]. This paper builds on the green payments proposed in [8]. It also examines a novel concept of a game based model in combination with an auction process to characterise the interactions that exist between the different competing elements in an auction based DSA network. This is done to reduce the amount of energy consumed in the system. The use of these two concepts to model a DSA network can also be found in [10-13].
The remaining parts of this paper are organised as follows: Section II defines some of the new and important models used in this paper. Section III defines the utility function adopted. Section IV shows a modelling scenario with the game model. Section V gives the results and discussion while the last section is the conclusions and future work.
II. System Model and Parameters
To model a heterogeneous network, the users in this paper are divided into two groups, the High Powered Users (HPUs) and the Low Powered Users (LPUs). The HPU requires a higher quality of service when compared to the LPU. Only these two categories are compared for simplification purposes. Furthermore, we consider the presence of the service provider, called the Wireless Service Provider (WSP), whose responsibility is to provide radio spectrum access to the users. These three entities form the players in the game model.
The Energy Model
The energy model is represented as a two state Markov chain shown in Fig. 1 and explained thus:
1. A user who has file(s) to send moves into the OFF state and continues to be in this state until such a user is among the winning bidders.
2. A user who is among the winning bidders moves from the OFF state to the ON state.
3. The user remains in the ON state until after transmission if transmission is successful, or until the user receives a failed signal either due to a low offered bid compared to the reserve price or due to a poor quality channel.
4. After transmission the user moves back to the OFF state before switching completely off if no file is to be sent again. However, if the user has another file to send, the user remains in the OFF state and attempts again. The complete off mode (not in Fig. 1) is the mode a user is in when there is no file to be sent.
A processing time, which is the time taken to process the received bid, is also assumed. All users that move from the ON state to the OFF state have the same processing time.
Fig. 1. Energy and system model as a two state Markov chain
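The four steps of the energy model can be read as a small state machine. The sketch below is only an illustration of those steps: the names `State`, `next_state` and the boolean flags are hypothetical and do not come from the paper, and the complete off mode is included as a third state for convenience.

```python
from enum import Enum


class State(Enum):
    COMPLETE_OFF = "complete_off"  # no file to send (not shown in Fig. 1)
    OFF = "off"                    # has a file and is waiting to win an auction
    ON = "on"                      # among the winning bidders, transmitting


def next_state(state: State, has_file: bool, won_auction: bool = False,
               transmission_ended: bool = False) -> State:
    """Return the next state of a user, following steps 1-4 of the energy model."""
    if state is State.COMPLETE_OFF:
        # Step 1: a user with a file to send moves into the OFF state.
        return State.OFF if has_file else State.COMPLETE_OFF
    if state is State.OFF:
        if not has_file:
            # Step 4: nothing left to send, switch completely off.
            return State.COMPLETE_OFF
        # Step 2: only a winning bidder moves from OFF to ON.
        return State.ON if won_auction else State.OFF
    # Step 3: stay ON until the transmission ends (success or failure signal).
    return State.OFF if transmission_ended else State.ON
```

A user that loses an auction simply stays in `State.OFF` and tries again in the next bidding round.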
The Reserve Price
The reserve price is the minimum price to be paid by any user intending to transmit before the spectrum is allocated to such a user. When the demand is low the reserve price helps to retain the minimum selling price of the WSP as shown in [9]. It is formulated by taking into account the current traffic load in the system, the frequency band, the total number of channels and the number of channels in use as:
$$ RP\ (\text{Price Unit}) = C_f \, N_{TC} \, C_r \tag{1} $$
Where $C_r$ is a constant in price units which is used to specify the value of a spectrum band in use. This value is determined from the common knowledge regarding the common price of the radio spectrum and it is specified in the parameters Table 1. The users believe that the bigger the size of the network, the better the quality of service offered; hence, the total number of channels in the system ($N_{TC}$) is also taken into consideration when calculating the reserve price. The congestion factor ($C_f$), as shown below, is introduced because of the laws of demand and supply as explained in [14]:
$$ C_f = \frac{N_{USA}}{N_{AC}} \tag{2} $$
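As a worked illustration of equation (1), the sketch below computes a reserve price from a congestion factor. The function and parameter names are hypothetical. The congestion factor is assumed here to be the ratio of users seeking access to available channels, which is one reading of the demand and supply argument; the source equation is not fully legible, so this ratio should be treated as an assumption rather than the paper's definition.

```python
def congestion_factor(n_users_seeking_access: int, n_available_channels: int) -> float:
    """Assumed form of C_f: demand (users) divided by supply (channels)."""
    if n_available_channels <= 0:
        raise ValueError("at least one channel must be available")
    return n_users_seeking_access / n_available_channels


def reserve_price(c_f: float, n_total_channels: int, c_r: float = 0.7) -> float:
    """Equation (1): RP = C_f * N_TC * C_r, with C_r = 0.7 as listed in Table 1."""
    return c_f * n_total_channels * c_r
```

For example, 8 users competing for 4 available channels out of 4 in total give `reserve_price(congestion_factor(8, 4), 4)`, about 5.6 price units, and the value grows linearly as more users join.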
The User's Bid
In an auction process, the bid of a user is important as it determines if the user wins or loses at the end of the process. To simplify the bid generation process, a concept called the Offered Bid Bin (OBB) is introduced. The OBB is like a lottery/raffle basket containing different bid values. A bidder dips into the bin (depending on the belief of the user) and picks a bid value. It is assumed that $Abs$ bins are available in the system and they are arranged in ascending order. Each bin contains a specified range of continuous values, $OBB_1 < OBB_2 < OBB_3 < \ldots < OBB_{Abs}$. This means that a bid picked from $OBB_2$ is greater than a bid picked from $OBB_1$ ($b_i^{OBB_1} < b_i^{OBB_2} < \ldots < b_i^{OBB_{Abs}}$), where $b_i^{OBB_s}$ is the bid value picked by user $i$ from $OBB_s$.
A user intending to seek access to the radio spectrum picks a bid from any of the bins depending on the user's belief regarding the values of the bids submitted by other users in the system. It is quite similar to the traffic load bin used in [15]. However, unlike in [15] where the bids are assumed to be discrete values, here the values are real numbers. The OBB is formulated as explained because the assumption in [15] that a user knows the system's traffic load might not always be true, as such information is available mainly to the WSP.
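The bin mechanism can be sketched in a few lines. The following is only an illustration: equal-width bins, the bid range and uniform sampling inside a bin are assumptions made for the example, and `make_bins` and `pick_bid` are hypothetical names.

```python
import random
from typing import List, Tuple


def make_bins(min_bid: float, max_bid: float, n_bins: int = 12) -> List[Tuple[float, float]]:
    """Split [min_bid, max_bid] into n_bins contiguous, ascending ranges (assumed equal width)."""
    width = (max_bid - min_bid) / n_bins
    return [(min_bid + k * width, min_bid + (k + 1) * width) for k in range(n_bins)]


def pick_bid(bins: List[Tuple[float, float]], bin_index: int, rng: random.Random) -> float:
    """Draw a real-valued bid from bin number bin_index (1-based), assumed uniform."""
    low, high = bins[bin_index - 1]
    return rng.uniform(low, high)
```

With `make_bins(1.0, 13.0)`, bin 5 covers bids between 5 and 6, so any bid drawn from a higher bin is at least as large as any bid drawn from a lower one.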
The User's Belief
As stated earlier, the offered bid of a user depends on the belief of the user regarding the bids of others. Two belief models are proposed, the greedy and the learning model.
The Greedy or Non-learning Process
A user using the greedy model is assumed to be myopic and only intends to maximise its utility by bidding using a low price value. Such a user is known as an extremely price sensitive bidder [16]. The bidder does not mind wasting energy by losing the auction process. Hence, it is assumed here that such a user is not learning the bid of the others or the reserve price.
Fig. 2. Summary of the learning process. WSP objective: learns the best reserve price that helps in maximising profit. LPU objective: price to win the auction process. HPU objective: best bidding price to offer when participating in the auction process.
The Learning Process
Learning about the optimal bidding price can be useful to control the traffic load in the system, especially when the system is congested, in addition to the reduction in consumed energy and delay as demonstrated in [17]. Users that use the learning model are assumed to be interested in always winning or in not wasting energy.
LPU Learning
An LPU receives a form of subsidy using the green payment equation as explained in [8] (while the HPUs are taxed using the same green payment equation). It is assumed that the LPU are provided with information about the previous bids of the HPU in addition to the incentive received from the WSP. This information is used by the LPU as the prior information during the learning process. The WSP provides such information only to the LPU because, as shown in [8], the WSP prefers the LPU transmitting rather than the HPU to keep interference in the system low.
HPU Learning
An HPU can only learn about the bids of the LPU based on an estimated prior knowledge while using the Bayesian learning model [11]. The HPU learn to transmit only when the LPU are not transmitting to increase their chances of winning the auction process (Fig. 2).
WSP Learning
The information available to the WSP is the bids submitted by the users. The aim of the WSP is to maximise revenue. Therefore, the WSP learns the user's reservation price. The reservation price is determined by the user's budget as explained in [19]. If the reserve price is higher than the user's reservation price then the user is unable to pay, hence the spectrum is not utilised. On the other hand, if there is congestion in the system, the WSP can increase the reserve price to prevent more users attempting to transmit.
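The text refers to Bayesian learning without giving the update rule. One common way to realise such learning, shown here purely as a generic illustration and not as the paper's method, is a Beta-Bernoulli update of the estimated winning probability of each OBB. The class name `BinBelief`, the uniform Beta(1, 1) prior and the 0.8 target are all assumptions.

```python
class BinBelief:
    """Per-bin Beta-Bernoulli belief about winning the auction (illustrative only)."""

    def __init__(self, n_bins: int = 12) -> None:
        # Beta(1, 1) prior for every bin: one pseudo-win and one pseudo-loss.
        self.wins = [1] * n_bins
        self.losses = [1] * n_bins

    def update(self, bin_index: int, won: bool) -> None:
        """Record the outcome of one bidding round for a 1-based bin index."""
        if won:
            self.wins[bin_index - 1] += 1
        else:
            self.losses[bin_index - 1] += 1

    def win_probability(self, bin_index: int) -> float:
        """Posterior mean of the winning probability for the given bin."""
        w, l = self.wins[bin_index - 1], self.losses[bin_index - 1]
        return w / (w + l)

    def cheapest_bin(self, target: float = 0.8) -> int:
        """Lowest bin whose estimated winning probability reaches the target."""
        for k in range(1, len(self.wins) + 1):
            if self.win_probability(k) >= target:
                return k
        return len(self.wins)  # fall back to the highest bin
```

A price sensitive user would bid from `cheapest_bin()`, which ties the learning step to the preference for winning with the least possible amount.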
III. The Utility Function
The utility function plays an important role in determining the achievable performance of a system. It describes the level of satisfaction or the preference of a user based on the QoS received [20]. It can be used in radio resource management to determine the level of satisfaction of the users. The utility function can be described in different ways, but the choice of the function is critical in achieving the desired performance. In this paper, it is defined for each set of players using a power utility function because of its rapidly increasing nature. All the players are assumed to be rational and they seek to maximize their utility. The utility function of the users is divided into four parts: the utility based on the bid value ($U_b$), the utility based on the OBB ($U_{OBB}$), the utility based on the energy consumed per file sent ($U_E$) and the utility based on the green payments ($U_R$).
Utility in Terms of the OBB
The higher the OBB a user picks a bid from, the lower the utility of the user in terms of the OBB. This means that a user that picks a bid from $OBB_1$ has a higher utility value in terms of the OBB compared to a user that picks a bid from $OBB_2$ or higher ($U(OBB_{Abs}) < U(OBB_{Abs-1}) < \ldots < U(OBB_2) < U(OBB_1)$). This is because it is assumed that the users are price sensitive and the users aim to win with the least possible amount.
$$ U_{OBB} = 2^{\left(1 - \frac{OBB_i}{OBB_{Abs+1}}\right)} - 1 \tag{3} $$
Where $OBB_i$ is the bin where user $i$ picks a bid and $OBB_{Abs}$ is the bin containing the maximum possible bids. The bin ($OBB_{Abs}$) that contains the set of maximum possible bid values has the least utility. $OBB_{Abs+1}$ is used as the denominator in order to avoid a user picking a bid from $OBB_{Abs}$ and having a utility of zero.
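The properties stated above (utility falling with the bin number, and a non-zero utility for the highest bin because of the $Abs+1$ denominator) can be checked numerically. The sketch below assumes the exponent has the form $1 - OBB_i / OBB_{Abs+1}$ with bins identified by their index; this form is inferred from the description and the function name `utility_obb` is hypothetical.

```python
def utility_obb(bin_index: int, n_bins: int = 12) -> float:
    """Power utility 2**x - 1 with x = 1 - bin_index / (n_bins + 1) (assumed form)."""
    if not 1 <= bin_index <= n_bins:
        raise ValueError("bin_index must lie between 1 and n_bins")
    return 2 ** (1 - bin_index / (n_bins + 1)) - 1
```

With the 12 bins of Table 1, the utility decreases strictly from bin 1 to bin 12 and stays above zero even for bin 12.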
Utility in Terms of the Actual Offered Bid
The utility in terms of the actual offered bid allows us to differentiate between users picking a low value of the bid and those picking a high value from the same OBB. As an illustration, a user offering a bid of 5.55 picked from $OBB_5$ has a lower utility compared to a user picking 5.95 from the same bin. The utility is formulated as shown below, where the set $N_{WU}$ represents the winning bids in a bidding round:
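$$ N_{WU} = \{ b_1, b_2, \ldots, b_{N_{WU}} \} \tag{4} $$
$$ \delta = \begin{cases} \max(N_{WU}) - \min(N_{WU}) & \text{for } b_i < \max(N_{WU}) \\ \max(N_{WU}) + d_k - \min(N_{WU}) & \text{for } b_i = \max(N_{WU}) \end{cases} \tag{5} $$
$$ U_b = \begin{cases} 2^{\delta} - 1 & \text{if a bidder wins} \\ 0 & \text{otherwise} \end{cases} \tag{6} $$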
bi is the bid of any user i. If a bidder is not among the winning bidders, the utility of such a user is zero. The lower part of equation 5 contains a fixed value dk which is specified in the parameter table. This is used for the user with the maximum bid to prevent a user from having a utility function value of zero. The value of dk is picked to be quite small so that it does not affect the utility of the highest bidder.
Utility in Terms of Energy Consumed During the Bidding Process
From the energy model, the more efficient a user is in terms of offering a bid that is accepted by the WSP, the more energy efficient the user is. A user whose bid is never rejected is considered to be more energy efficient compared to a user whose bid is sometimes or often rejected. This is because a user can only participate in the bidding process when in the ON state as explained earlier. It is measured as shown below:
$$ U_E = 2^{\left( N_{FS} / N_{FG} \right)} - 1 \tag{7} $$
Where $N_{FS}$ is the number of times a user has sent a file successfully and $N_{FG}$ is the number of times a user $i$ has attempted to send a file but the user's bid was rejected as a result of price. A bid rejected as a result of other factors (apart from price) is not considered as part of $N_{FG}$.
Utility in Terms of the Green Payments
The concept of the green payments was formulated in [8]. The utility in terms of the green payments is set to determine the satisfaction of the user depending on the value of the received green subsidy. The higher the amount of green payments subsidy received, the higher the utility of a user in terms of the green payment. However, it is assumed that a user paying a tax has a utility value of zero in terms of the green payment. This is done to allow for the simplification of this work rather than having a negative utility.
$$ U_R = \begin{cases} 2^{\left( R_i / R_{max} \right)} - 1 & \text{for green subsidy} \\ 0 & \text{for green tax} \end{cases} \tag{8} $$
$R_i$ is the green payment tax/subsidy for user $i$ and $R_{max}$ is the maximum subsidy.
The Overall Utility of the User
The overall utility of each user can vary between 0 and 1 as shown below:
$$ U = \frac{U_b + U_E + \frac{1}{\omega}\left(U_R + U_{OBB}\right)}{2 + \frac{2}{\omega}} \tag{9} $$
Where $\omega$ can vary between 1 and 2. This is done in order to vary the impact of $U_R$ and $U_{OBB}$ on the utility value. $\omega$ is specified in the parameters Table 1. It is introduced to reduce the weight associated with the utility in terms of the green payments and the OBB because it is assumed that they have less impact on the general utility of the users in this model. The components of the utility function that have less impact depend on the service offered by the system. This is because the satisfaction derived by users varies with the offered service. The peak point in Fig. 3 might be difficult to achieve because a user might prefer one factor more than the others, depending on the application in use.
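A short numerical sketch helps to see how the weighting parameter affects the overall utility. It assumes the combination $(U_b + U_E + (U_R + U_{OBB})/\omega) / (2 + 2/\omega)$, which keeps the result between 0 and 1 when each component lies in that range; this exact form is an assumption, and the function name `overall_utility` is hypothetical.

```python
def overall_utility(u_b: float, u_e: float, u_r: float, u_obb: float,
                    omega: float = 1.0) -> float:
    """Weighted combination of the four utility components (assumed form of eq. 9)."""
    if not 1.0 <= omega <= 2.0:
        raise ValueError("omega is expected to lie between 1 and 2")
    weighted_sum = u_b + u_e + (u_r + u_obb) / omega
    return weighted_sum / (2.0 + 2.0 / omega)
```

With `omega=1.0` all four components count equally, while `omega=2.0` halves the weight of the green payment and OBB terms, so `overall_utility(1, 1, 0, 0, omega=2.0)` already reaches two thirds.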
Utility of the WSP
The utility of the WSP is based on the total revenue obtained. It is as shown below:
$$ U_W(t) = 2^{\left( N_{CAU}(t) / N_{TC}(t) \right)} - 1 \tag{10} $$
Where $N_{CAU}(t)$ is the total number of channels that were available and used up till time $t$ and $N_{TC}(t)$ is the total number of channels that were available in the system up till time $t$. It is assumed that if a channel is not occupied, the WSP is losing some revenue.
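The WSP utility depends only on channel occupancy, which a short sketch makes concrete. The function name `wsp_utility` is hypothetical and the exponent is taken to be the ratio of used to available channels, as described in the text.

```python
def wsp_utility(channels_used: int, channels_available: int) -> float:
    """Power utility of the WSP: 2**(used / available) - 1."""
    if channels_available <= 0:
        raise ValueError("channels_available must be positive")
    return 2 ** (channels_used / channels_available) - 1
```

Full occupancy gives a utility of 1 and an idle system gives 0, so every unoccupied channel lowers the WSP's utility.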
Table 1. Parameters used
Parameter | Value
Cell radius | 2 km
Interference threshold | -40 dBm
Users in a cell | 200
Number of cells | 19
Noise floor | -114 dB/MHz
SINRmax | 21 dB
SINRthreshold | 1.8 dB
Cr | 0.7
Max number of channels per cell | 4
Height of base station | 15 m
Height of mobile station | 1 m
Budget | 100000 Price Units
Transmit power for users | 0.9 W/bit
Energy consumed by device | 0.5 Watt sec
Power used in bidding | 0.25% of the transmit power
Abs | 12
dk | 0.001
ω | 1
Fig. 3. Illustration of the Utility Function
IV. The Modelling Scenario
A cognitive network with users seeking access to the spectrum in an opportunistic manner is modelled, where $N_{USA}$ out of the possible $N$ users in the system are competing for $N_{AC}$ unlicensed channels (where $N_{AC}$ is the number of available channels). A multi-channel scenario ($N_{AC} > 1$) is modelled using an uplink scenario. The bid of each user is either taxed or subsidized using the concept of green payments as described in [8]. The channel is allocated to the highest bidder(s), represented as $N_{WU}$, using the first price sealed bid auction with a reserve price as explained in [21]. The WINNER II B2 propagation model is used as detailed in [22]. The parameters used in the simulations are as given in Table 1.
The truncated Shannon equation is used to model the transmission rates of each of the users as detailed in [23]. The flow chart is shown in Fig. 4.
Fig. 4. System Flow Chart
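The allocation rule described above, a first price sealed bid auction with a reserve price over several channels, can be sketched as follows. This is a simplified illustration: the green payment tax or subsidy and channel quality are ignored, ties are broken by submission order, and `run_auction` is a hypothetical name.

```python
from typing import Dict


def run_auction(bids: Dict[str, float], reserve_price: float,
                n_channels: int) -> Dict[str, float]:
    """Return the winners and the price each pays (its own bid, first price rule)."""
    # Bids below the reserve price are rejected outright.
    eligible = {user: bid for user, bid in bids.items() if bid >= reserve_price}
    # The highest eligible bids win, one channel per winner.
    ranked = sorted(eligible.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n_channels])
```

For instance, with bids of 5, 3, 7 and 1 price units, a reserve price of 2 and two channels, the users bidding 7 and 5 win and pay exactly what they bid.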
The Game Model
The game model is used to examine the utility of the learning users compared to the non-learning users. This section also investigates if a player can increase their utility by unilaterally changing from the learning model to the non-learning model or the other way round. The already formulated utility functions, as explained, are used.
A game model is used to study the allocation of the spectrum to obtain a satisfactory and fair energy efficient auction based mechanism. This paper assumes a game which can be represented as a tuple $\Gamma = [P, A, U]$, where $P$ represents the set of players in the game, $A$ represents the set of actions that is available to the players and $U$ is the payoff or the utility obtained by taking an action. The players are represented as $P = [G_{HPU}, G_{LPU}, W]$, where $G_{HPU}$ represents the HPU, $G_{LPU}$ represents the LPU and $W$ represents the WSP. Two actions are available to the players: to either learn or use the greedy/non-learning approach ($A = [A_l, A_g]$). The aim of each player is to maximise the obtained utility by bidding using the bid value that offers the maximum possible utility. The utility of the WSP depends on the revenue received as explained earlier. The players in the same group form a coalition using transfer learning. In this coalition, they share information such as the optimal OBB with each other. The aim of the game is to examine how a Nash Equilibrium can be achieved.
Each group of players can choose different actions, but all the players in the same group can only choose or use the same action in a bidding round. This means that if $G_{LPU}$ decides to learn, all the users in the group are learning. If $G_{LPU}$ is not learning then no user in that group can decide to learn. This is the same for $G_{HPU}$ and the WSP.
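The unilateral deviation test behind a Nash Equilibrium can be written down directly for three players with two actions each. The sketch below uses a toy payoff function with made-up integer values chosen only to exercise the check; it is not derived from the paper's utility functions, and `is_nash` and `toy_payoff` are hypothetical names.

```python
from itertools import product
from typing import Callable, Tuple

PLAYERS = ("HPU", "LPU", "WSP")
ACTIONS = ("learn", "greedy")
Profile = Tuple[str, ...]


def is_nash(profile: Profile, payoff: Callable[[Profile], Tuple[int, ...]]) -> bool:
    """True if no single player gains by unilaterally switching its action."""
    base = payoff(profile)
    for idx in range(len(profile)):
        for alternative in ACTIONS:
            if alternative == profile[idx]:
                continue
            deviated = profile[:idx] + (alternative,) + profile[idx + 1:]
            if payoff(deviated)[idx] > base[idx]:
                return False
    return True


def toy_payoff(profile: Profile) -> Tuple[int, ...]:
    """Hypothetical payoffs: everyone benefits from each learner, learners get a bonus."""
    n_learning = profile.count("learn")
    return tuple(n_learning + (1 if action == "learn" else 0) for action in profile)


equilibria = [p for p in product(ACTIONS, repeat=len(PLAYERS)) if is_nash(p, toy_payoff)]
```

Under these toy payoffs the only profile that survives the check is the one in which all three players learn; with simulated utilities in place of `toy_payoff`, the same function would test any candidate profile.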
In the game formulation, a player belonging to $G_{LPU}$ learns the optimal bid value by learning based on the prior probability provided by the WSP using Bayesian learning, or adopts the greedy model. Each HPU can also decide to use the greedy model or to learn the likelihood of being among the highest bidders, and stays out if the likelihood is low. Depending on the value of the likelihood, the number of HPU that should attempt to bid during the next bidding round is determined. The equation of the likelihood is formulated such that the number of HPU attempting depends on the available channels and the offered bid of the users. This prevents a situation where the users are attempting to access the channels either with a low value of offered bid or when few channels are available in the system. This is because in such scenarios, it is most likely that the channels would be allocated to the LPU who are also attempting during the same bidding round. The formulation is as shown below:
$$ Pr(i) = \frac{b_i - b_{min}}{b_{max} - b_{min}} \times N_{AC} \tag{11} $$
Where $b_{min}$ is the value of the reserve price if known to the user, otherwise it is the minimum possible bid by user $i$ based on the budget (if the user picks the maximum OBB), $b_{max}$ is the budget of a user per file and $b_i$ is the bid for user $i$. It is calculated for all the HPU users. If the probability is high for all the HPU attempting to transmit, then they are allowed, but if it is low, only a fraction are allowed as shown in equation (12). The users allowed are picked in descending order of the probability. The numbers allowed depend on the number of users and the number of channels available. This is because at low traffic loads more HPU can be allowed; the numbers allowed decrease as the traffic load increases. It is as shown below:
$$ N^{A}_{USA_{HPU}}(t) = Pr \times N^{T}_{USA_{HPU}}(t) \tag{12} $$
Where $N^{T}_{USA_{HPU}}(t)$ is the total number of HPU who arrived and want to transmit during a transmission period $t$, $N^{A}_{USA_{HPU}}(t)$ is the number of arriving HPU that are allowed to attempt to transmit after multiplying by the probability, and $Pr$ is the probability calculated from equation (11). This shows that the higher the likelihood, the higher the number of HPU allowed into the system. However, using the equation to determine the number of users allowed is not optimal. Therefore, the HPU varies the probability ($Pr$) in equation 12 and learns the optimal value for each traffic load, provided $Pr$ is positive initially. The equation is used in generating the prior probability and it serves as the basis for the learning process. The HPU users use Bayesian learning as explained in [17] to learn the optimal number of users to be admitted into the system by exploring different numbers starting from the minimum provided by equation 12. Furthermore, the WSP also learns the traffic load, which is used to fix the reserve price. When the system is congested (at a traffic load of 4 Erlangs and above) the reserve price is fixed in such a manner that only bids from the highest OBB can be above the reserve price. Therefore, the HPU paying the green tax are denied complete access to the spectrum. In this model it is assumed that the WSP is also learning the traffic load in the system using the Bayesian learning model in order to fix the appropriate reserve price. Below is a summary of the assumptions:
• Players are rational and are seeking the best action which they understand to be the actions that maximise their utility;
• All the players who are users ($G_{HPU}$, $G_{LPU}$) have the same budget ($B$) per file and no user can spend above his budget under any condition;
• A participating user in each group submits a bid ($b_1, b_2, b_3, \ldots, b_{N_{USA}}$) where $N_{USA}$ is the number of users submitting a bid;
• All users in the same group pick the bid value using the same OBB provided they are bidding in the same bidding round;
• All the players can either choose to learn or adopt the greedy approach.
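The admission step for the HPU, in which a fraction of the arriving users is allowed and those users are picked in descending order of likelihood, can be sketched as below. The per-user likelihoods are taken as given inputs; rounding the allowed number down and the name `admit_hpu` are assumptions made for the example.

```python
import math
from typing import Dict, List


def admit_hpu(likelihoods: Dict[str, float], pr: float) -> List[str]:
    """Select the HPU allowed to bid: floor(pr * arrivals) users, highest likelihood first."""
    n_arrived = len(likelihoods)
    # Equation (12): allowed = Pr x arrived, kept within [0, n_arrived] (rounding down assumed).
    n_allowed = max(0, min(n_arrived, math.floor(pr * n_arrived)))
    ranked = sorted(likelihoods, key=likelihoods.get, reverse=True)
    return ranked[:n_allowed]
```

A learning HPU group would then vary `pr` from round to round and keep the value that yields the best utility for the current traffic load.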
V. Results and Discussion
Examining the performance of the system using the modelling scenario, Fig. 5 shows the utility obtained by the HPU and the LPU against iteration at 3 Erlangs. In the game formulation, the LPU learn the OBB that gives them the highest utility while the HPU learn the traffic load in the system. A traffic load of 3 Erlangs is used in the game formulation because at 4 Erlangs the HPUs are never allowed to transmit in the system as explained earlier. Therefore, no results can be obtained for the HPU.
The utility obtained by either the LPU or the HPU increases as the learning progresses. However, at the 20th iteration the utility of the HPU decreases because the HPUs are exploring the possibility of allowing more HPU to attempt to transmit, but such users are unable to transmit; therefore the utility in terms of $U_E$ reduces. It is worth pointing out that throughout the game formulations it was assumed that the HPU has learnt the best OBB to use and is only picking bids from the best OBB. Therefore, $U_{OBB}$ for the HPU is constant. The utilities obtained by the LPU are higher than those of the HPU because the LPU are given more priority to transmit compared to the HPU because of the green payments. Fig. 5 shows the utility of each user that is learning. The results when one of the players deviates from the learning process are now shown in order to examine the effects of such a user deviating.
Fig. 5. Utility of HPU and LPU when both are learning (utility against iteration)
Fig. 6(a) shows the average utility obtained by all the users in the system when all the 3 players are learning and the average utility when one of the three players is deviating from the learning model. The average for one deviation is shown because, on average, the utility graph of any player deviating looks similar. Hence, the three utilities are summed together and the average is used. It can be seen that if one of the players is deviating, the utility is lower compared to when all the users are learning. This is because if any of the players is not learning, energy is wasted and the utility obtained is lower. Fig. 6(b) shows the utility obtained with all three learning. As the traffic load increases, the average utility reduces because the utility of the individual users falls.
Fig. 6. Utility for all the 3 players learning and utility for one player deviating (utility against traffic load in Erlang)
Figure 7(a) shows the average energy consumed by the system when the LPU and the HPUs are learning. The LPU consume less energy compared to the HPU. This should be expected because of the difference in their transmit powers. As the learning progresses, the energy consumed reduces. This is because the users are learning either the optimal bidding price or the appropriate number of users to be introduced into the system depending on the traffic load in the system.
Fig. 7(b) shows the average energy consumed by the system (both the HPU and the LPU) when all the users are learning and the average energy when one of the users is deviating from the learning model. It can be seen that the average energy consumed with one deviation is significantly higher. This is because when one of the players is not learning, the energy consumption level of the players is increased compared to when all the three players are learning. The learning process gets better for the learning players as the number of iterations increases, and the amount of energy consumed reduces until the best utility is obtained.
Fig. 7. (a) The average energy consumed by LPU and HPU; (b) the average energy consumed with all learning and the average with one of the players deviating
Figure 8(a) shows the average energy consumed per file sent against traffic load when all three players are learning, the average with one of the users deviating from the learning model and when none of the players are learning. It can be seen that as the traffic load increases, the energy consumption increases for all the scenarios. This is because as the traffic load increases the collision and activity in the system increase. When all the three players are learning the average energy consumption is lower, and the reason is the same as explained for Fig. 7. It can be seen that using the proposed model an average of 40% of energy is saved compared to when none of the users are learning.
Fig. 8. (a) Energy consumption; (b) utility in terms of energy consumption (both against traffic load in Erlang; legend: All Learning, Average With One Deviation, No Learning)
Figure 8(b) shows the utility obtained in terms of energy consumption ($U_E$) against traffic load. It can be seen that the average utility falls with the traffic load because as the traffic load increases the activity in the system increases and more collisions occur in the system. As expected, when all the three players are learning, the average utility is significantly higher than when a user is deviating, especially as the traffic load increases. At lower traffic loads, the users can avoid each other by transmitting on different channels, making the values closer at lower traffic loads compared to higher traffic loads. It can also be seen that with the proposed model there is an average of 20% increase in utility compared to when the learning process is not used.
Delay is one of the important parameters that determine the functionality of a wireless network. This is because different applications have different tolerance levels for delay. Hence the delay experienced by the players is also examined. Fig. 9 shows the delay against the traffic load when all the players are learning, when one of the players is deviating and when all the players are deviating. The delay increases as the traffic load increases for all the 3 scenarios because as the traffic load increases, the number of users entering the system also increases, thereby increasing the delay. It can be seen that the delay in the system is lower when all the players are learning compared to when one player is deviating or all are deviating. There is an average of 33% reduction in delay using the proposed model for all traffic loads that were considered.
Fig. 9. The system delay with all three scenarios (delay against traffic load in Erlang; legend: No Learning, All Three Players Learning, Average With One Deviation)
Another important performance metric in a wireless communication network is the blocking probability. Hence the blocking probability is examined to see if there is an improvement in the blocking probability of the system with the players learning. Fig. 10 shows the blocking probability of the system when all the three players are learning and the average blocking when one of the players is deviating from the learning model against the traffic load in the system. It can be seen that as the traffic load increases, the blocking also increases. This is because there is an increase in the system's collision. This result shows that learning reduces the blocking experienced by the users. Hence, the performance parameters are better with learning.
All three players contribute in one way or another to the performance of the system, hence the effect of the WSP not learning is examined. Fig. 11(a) shows the utility obtained by the WSP when learning and when using the greedy model. As expected, the utility obtained when learning is significantly higher than when not learning. This is because when the WSP is not learning, the reserve
Fig. 10. The blocking probability for all three players learning and the average with one of the three players deviating from learning. Horizontal axis: Traffic Load (Erlang). Legend: With Learning; Average Blocking with One Set of Players Using the Greedy Model
Fig. 11. Utility against traffic load (Erlang): (a) WSP learning at 3 Erlangs and WSP not learning; (b) WSP and one of the users not learning. Legend (a): WSP Not Learning; WSP Learning. Legend (b): WSP and One User Group Not Learning; WSP Not Learning and Both User Groups Learning; All Three Players Learning
price in the system is not set to reflect the present situation. Hence, the learning process converges to a non-optimal value. This shows that it is important for the WSP to learn and to use the reserve price to control the admission process. Fig. 11(b) shows the average utility obtained when the WSP and one of the user groups are not learning, when the WSP is not learning but both user groups are, and when all three players are learning. In all three scenarios the utility obtained by the WSP increases with the traffic load, because more of the available channels are in use. The results also show that the greater the number of players not learning, the lower the overall utility.
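The role of the reserve price in admission control can be illustrated with a minimal sketch. It assumes, purely for illustration, that the WSP admits the highest bids at or above the reserve price until the free channels are exhausted; the function name, the bid values and this tie-free ranking rule are hypothetical and are not taken from the paper.

```python
def admit_bids(bids, reserve_price, free_channels):
    """Return the user ids admitted: bids at or above the reserve, highest first."""
    eligible = [(user, bid) for user, bid in bids.items() if bid >= reserve_price]
    eligible.sort(key=lambda item: item[1], reverse=True)
    return [user for user, _ in eligible[:free_channels]]


# Hypothetical bids; the values are placeholders, not simulation data.
bids = {"u1": 0.9, "u2": 0.4, "u3": 0.7, "u4": 0.6}
print(admit_bids(bids, reserve_price=0.5, free_channels=2))  # ['u1', 'u3']
print(admit_bids(bids, reserve_price=0.8, free_channels=2))  # ['u1']
```

With the higher reserve price only one user is admitted even though a second channel is free, which shows how a reserve price that does not track the present situation can leave spectrum unused.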
The results show that none of the players is better off, that is, none obtains a higher utility value, by deviating from the learning model. This shows that learning by all three players forms a Nash Equilibrium for the proposed game model, given the definition of Nash equilibrium in [70].
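The unilateral-deviation argument can be expressed as a small check. The sketch below uses the standard textbook notion that a strategy profile is a Nash equilibrium when no single player can raise its own utility by changing only its own strategy. The toy utility function is a hypothetical stand-in, not the utility function developed in this paper.

```python
def is_nash_equilibrium(utility, profile, strategies):
    """True if no player gains by unilaterally changing its own strategy.

    `utility(profile)` must return a tuple of utilities, one per player.
    """
    base = utility(profile)
    for player in range(len(profile)):
        for alternative in strategies:
            if alternative == profile[player]:
                continue
            deviated = profile[:player] + (alternative,) + profile[player + 1:]
            if utility(deviated)[player] > base[player]:
                return False
    return True


def toy_utility(profile):
    """Hypothetical payoff: everyone benefits from each learner, learners slightly more."""
    learners = sum(s == "learn" for s in profile)
    return tuple(learners + (1.0 if s == "learn" else 0.5) for s in profile)


strategies = ("learn", "greedy")
print(is_nash_equilibrium(toy_utility, ("learn", "learn", "learn"), strategies))    # True
print(is_nash_equilibrium(toy_utility, ("greedy", "greedy", "greedy"), strategies))  # False
```

In the paper the corresponding utilities come from simulation rather than a closed-form payoff, so the same check would be applied to the measured utility of each player with and without deviation.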
VI. Conclusions and Future Work
This paper developed a learning scenario in which all the users in the system can learn simultaneously. Each of the users in the game model learnt different parameters. Utility functions were proposed that depend explicitly on four parameters which determine the satisfaction received by the users. The utility function was based on the bid price, the green payments and the energy consumed by the user during the auction process. The results showed that the energy consumed by the system is lower when all the users are learning the different parameters about each other than when one of the player groups is using the greedy model. As part of future work, a more mathematical model will be developed for the proposed system.
References
[1] Patil K., Prasad R. and Skouby K. A Survey of Worldwide Spectrum Occupancy Measurement Campaigns for Cognitive Radio, Devices and Communications (ICDeCom), 2011 International Conference on, 2011, 1-5.
[2] Wang Z. and Salous S. Spectrum occupancy statistics and time series models for cognitive radio, Journal of Signal Processing Systems, 2011, 62, 145-155.
[3] Jinzhao S., Jianfei W. and Wei W. Dynamic spectrum allocation for heterogeneous cognitive radio networks from auction perspective, Cognitive Radio Oriented Wireless Networks and Communications (CROWNCOM), 2011 Sixth International ICST Conference on, 2011, 176-180.
[4] Yao L., Hao H., Jun W. and Shaoqian L. Energy-efficient dynamic spectrum access using no-regret learning, Information, Communications and Signal Processing, 2009. ICICS 2009. 7th International Conference on, 2009, 1-5.
[5] Gür G. and Alagöz F. Green wireless communications via cognitive dimension: an overview, Network, IEEE, 2011, 25, 50-56.
[6] Zhu J. and Liu K.J.R. Cognitive radios for dynamic spectrum access - Dynamic Spectrum Sharing: A Game Theoretical Overview, Communications Magazine, IEEE, 2007, 45, 88-94.
[7] Iosifidis G. and Koutsopoulos I. Challenges in auction theory driven spectrum management, Communications Magazine, IEEE, 2011, 49, 128-135.
[8] Oloyede A. and Grace D. Energy Efficient Dynamic Spectrum Pricing for Cognitive Radio Based Cellular Systems Using The Concept of Green Payments, Paper under review. Submitted to Journal of Wireless and Personal Communications on 21st November 2014, 2014.
[9] Chunchun W., Sheng Z., and Guihai C. A strategy-proof spectrum auction for balancing revenue and fairness, Consumer Communications and Networking Conference (CCNC), 2014 IEEE 11th, 2014, 827-832.
[10] Kelly F.P., Maulloo A.K. and Tan D.K. Rate control for communication networks: shadow prices, proportional fairness and stability, Journal of the Operational Research Society, 1998, 49, 237-252.
[11] Sengupta S. and Chatterjee M. An Economic Framework for Dynamic Spectrum Access and Service Pricing, Networking, IEEE/ACM Transactions on, 2009, 17, 1200-1213.
[12] Haitao L., Chatterjee M., Das S.K. and Basu K. ARC: an integrated admission and rate control framework for competitive wireless CDMA data networks using noncooperative games, Mobile Computing, IEEE Transactions on, 2005, 4, 243-258.
[13] Marbach P. Pricing differentiated services networks: bursty traffic, INFOCOM 2001. Twentieth Annual Joint Conference of the IEEE Computer and Communications Societies. Proceedings. IEEE, 2001, 2, 650-658.
[14] Moore H.L. Empirical laws of demand and supply and the flexibility of prices, Political Science Quarterly, 1919, 34, 546-567.
[15] Oloyede A. and Grace D. Energy Efficient Bid Learning Process in an Auction Based Cognitive Radio Networks, Paper accepted in Bayero University Journal of Engineering and Technology (BJET), 2016/02/02 2016.
[16] Nan F., Siun-Chuon M. and Mandayam N.B. Pricing and power control for joint network-centric and user-centric radio resource management, Communications, IEEE Transactions on, 2004, 52, 1547-1557.
[17] Oloyede A. and Grace D. Energy efficient learning based auction process for cognitive radio systems, Consumer Communications and Networking Conference (CCNC), 2014 IEEE 11th, 2014, 35-40.
[18] Zhu H., Rong Z., and Poor H.V. Repeated Auctions with Bayesian Nonparametric Learning for Spectrum Access in Cognitive Radio Networks, Wireless Communications, IEEE Transactions on, 2011, 10, 890-900.
[19] Oloyede A. and Grace D. Energy Efficient Soft Real Time Spectrum Auction for Dynamic Spectrum Access, presented at the 20th International Conference on Telecommunications Casablanca, 2013.
[20] Youping Z., Shiwen M., Neel J.O. and Reed J.H. Performance Evaluation of Cognitive Radios: Metrics, Utility Functions, and Methodology, Proceedings of the IEEE, 2009, 97, 642-659.
[21] Oloyede A. and Dainkeh A. Energy efficient soft real-time spectrum auction, Advances in Wireless and Optical Communications (RTUWO), 2015, 113-118.
[22] Kyosti P., Meinila J., Hentila L., Zhao X., Jamsa T., Schneider C. et al. IST-4-027756 WINNER II D1.1.2 V1.2 WINNER II Channel Models, 2007. Access: http://www.cept.org/files/1050/documents/winner2%20-%20final%20report.pdf
[23] Burr A., Papadogiannis A., and Jiang T. MIMO Truncated Shannon Bound for system level capacity evaluation of wireless networks, Wireless Communications and Networking Conference Workshops (WCNCW), 2012 IEEE, 2012, 268-272.