Messages from 157825

Article: 157825
Subject: Re: Intel in Talks to buy Altera
From: rickman <gnuarm@gmail.com>
Date: Thu, 02 Apr 2015 20:15:27 -0400

On 4/2/2015 6:52 PM, already5chosen@yahoo.com wrote:
> On Friday, April 3, 2015 at 1:03:23 AM UTC+3, rickman wrote:
>> On 4/2/2015 5:30 PM, already5chosen@yahoo.com wrote:
>>> On Thursday, April 2, 2015 at 2:28:33 AM UTC+3, Rob Gaddi wrote:
>>>> On Wed, 01 Apr 2015 19:10:55 -0400, rickman wrote:
>>>>
>>>>> On 4/1/2015 1:27 PM, John Speth wrote:
>>>>
>>>>>> I've used both example products with great success.  As you said, it's
>>>>>> real convenient to roll your own peripherals with impunity.  It saved
>>>>>> me hours of coding effort when you can smartly implement the peripheral
>>>>>> of your dreams with a little HW design.
>>>>>
>>>>> The part that gets me about the newer versions of this theme is that
>>>>> they are large, pricey FPGAs and incorporate fairly high end CPUs which
>>>>> are typically programmed under Linux...  a very far cry from the
>>>>> efficient solution I would like to see.  There are few engineers who can
>>>>> even design the entire system on that chip spanning logic design and
>>>>> system programming.
>>>>
>>>> Agreed.  We're looking hard at both Zynq and the Cyclone V SOC, both of
>>>> which have big monster Cortex A9s meant to run Linux with a mess of DRAM
>>>> and etc.  Which, I mean we can make work.  But if I could get a 10-20KLUT
>>>> FPGA with a dual or quad Cortex M4 instead?  Nice and light with every
>>>> intention of running bare metal with 10-20K of code?  I'd take it in a
>>>> heartbeat.
>>>>
>>>> --
>>>> Rob Gaddi, Highland Technology -- www.highlandtechnology.com
>>>> Email address domain is currently out of order.  See above to fix.
>>>
>>> Cortex-M is most useful when you have a good chunk of flash on the same die. Which, unfortunately, would be incompatible with the silicon tech used for modern FPGAs.
>>
>> That is a bit of nonsense unless you consider Lattice and MicroSemi to
>> not be using "modern" FPGA processes.  They include Flash in their
>> devices for the configuration memory.
>>
>
> Well, you are right, I am not familiar with Lattice and MicroSemi. From the little I know about them, their FPGAs are modern in the sense that they are new products and, maybe, modern in specific system-level features. But when it comes to the size and performance of the fabric, including a characteristic as important to some of us as dynamic power per watt (static power is probably OK), they are at least 5 years behind X&A, and probably more than 5.


Sometimes I get really tired of Thunderbird.  When I reply to a post
the quoted text is just as likely as not to extend beyond the margin and 
off the screen... a bloody nuisance I tell you.

Anyway, I think you are not familiar at all with the Lattice products. 
They have lines of FPGAs that are RAM based like the X and A parts and 
are likely a generation behind in many terms... but you shouldn't focus 
on the things that are intangible to you.  Do you really care what 
geometry a part is made in?  No, you care whether your design will fit, 
if it will run fast enough and how much the part costs.  I think unless 
you need the largest parts in the X or A line the L parts will do the 
job competitively.

I would also mention that you can thank L for the availability of SERDES 
in lower cost FPGAs.  Lattice was the first to offer that and X and A 
only followed begrudgingly I think.

The other parts, like the XOx and XPx lines can't be compared to 
anything X or A makes (unless they've come out with something in the 
last 6 months) because X and A have shied away from the Flash based 
market.  They are adequately fast and have brought the price down to a 
point where they are competitive against MCUs in some apps.

So saying L is 5 years behind is probably not accurate and not very useful.


>>> DACs and SAR ADCs are also a problem. Delta-sigma ADCs probably less so, but I am not an expert. Anyway, for the apps that I care about, SAR is more useful than delta-sigma.
>>
>> Once again you should tell that to MicroSemi... They make a mixed signal
>> FPGA with CPU, analog and FPGA on one die.  I don't use it because of
>> the price, a bit higher than I like to see.
>>
>>
>>> Due to all these factors, a small embedded solution based on a Cortex-M integrated into an FPGA is likely to end up more complex, using more chips and more expensive than a solution based on a Cortex-M (or even Cortex-R) MCU + FPGA.
>>
>> I think you are saying that an FPGA with internal MCU is not as useful
>> as separate FPGA and MCU because the MCU will have lots of other stuff
>> integrated that would be additional chips with the integrated approach.
>
> Yes, NOR flash, ADCs, DACs.
>
>> Clearly it doesn't have to be that way since at least one company
>> makes such parts.
>>
>>
>>> The ugly part about the MCU + FPGA solution is that, unlike chips from the past, small modern Cortex-M MCUs rarely have a good bus to talk to an FPGA (good = simple, not too slow and not too many pins). But then again, those old 32-bit MCUs that had buses that I liked were in the $25+ price range. For a fair comparison I probably have to look at an old 8-bitter that I never even tried to connect to an FPGA.
>>
>> That brings us back to the real differences between the MCU world and
>> the typical FPGA world.  MCUs are intended for apps where speed is
>> limited by the software.  FPGAs are intended for apps where speed is
>> potentially much faster with the limitation potentially in the I/O.  So
>> a typical high end FPGA will have lots of I/O and some very fast I/O.
>>
>
> That's about right, except that I am not talking about high-end FPGAs, but about the modern "low-cost" lines of A&X. So fast I/O is optional and very fast I/O is rarely even an option (fast = 1-3.125 Gbit/s, very fast = >3.125).
> But for the MCU<->FPGA interface I would be mostly satisfied with a much more moderate speed. Say, something logically similar to the venerable LPC bus, but without the 24-bit address space limit (28 bits is probably acceptable) and with the physical layer of RGMII.

"High" speed is relative.  Integrated MCUs would have direct bus mapped 
access to FPGA connections which clearly would run at full speed 
depending on your FPGA design.  Multi-die solutions would need to be bit 
banged I/O from the MCU or use some peripheral like SPI or Ethernet.  As 
you say, there ain't no buses on many MCUs anymore.


>> But such an integrated MCU/FPGA device would not be intended for high
>> end apps with Mbps I/O.  The FPGA would be adding special functionality
>> that perhaps can't be done in the MCU alone.  I had a design that
>> required exactly this sort of need and ended up having to use an FPGA
>> with an attached CODEC since there were no MCUs which could implement
>> one interface.  The FPGA was a bit jammed up in terms of capacity (only
>> 3 kLUT).  A small soft core could do most of the work and potentially
>> free up some space.  Had a combined chip been available it would have
>> been a breeze to implement the one interface in hardware (or maybe two)
>> and the rest of the design in software.
>>
>>
>>> Back to another reason why I think that a hard ARM Cortex-M4 core in an [Altera or Xilinx] FPGA does not look like a very good proposition:
>>> The added value of an M3/M4 core alone, without flash and mixed-signal peripherals, is not that big. After all, a Nios2-f core (only the core, without debug support and the Avalon-MM infrastructure around it) occupies only ~25% of the smallest Cyclone4 device or ~7% of the smallest Cyclone5-E and achieves comparable performance. As far as I am concerned, the main advantage of Cortex-M is code density: significantly more code can fit on-chip. But even that is less important if we are talking about the Cyclone5 generation, because there the smallest member has 140 KB of embedded memory (not counting MLABs), which is often enough.
>>
>> Yep, the low end MCU on an FPGA without any of the peripherals would not
>> be a lot more interesting than a soft core.
>
> Just a nitpick - by definition there is no such thing as "MCU without any of the peripherals". Let's call them "MCU-style hard cores" or just "ARM Cortex-M4" because this particular core looks like the most logical (or least illogical) candidate.

Not sure what that means, but whatever.  Potatoes, Patahtoes.


>> So when will they be doing
>> a better job of the Vulcan mind meld and getting more analog on the FPGA
>> die?  It's not like there is anything so special about FPGA logic that
>> can't be done in analog compatible processes.  Maybe you lose some
>> density or performance, but that isn't what we are after.  At least *I*
>> am looking for a system on chip which includes some FPGA fabric.  Don't
>> think of it as an FPGA with an MCU on chip.
>
> Yes, could be nice. But to be really useful the FPGA part should not be too small. I wouldn't bother for 1K 4-input LUTs. 5K looks like a reasonable minimum, at least for gray-haired devs like you and me. Younger guys are spoiled, they'd want more than that.

My current product is shipping in a 3 kLUT device.  I could have shoved 
a lot more functionality in if I had used a soft core (of my own design, 
the canned ones are too large).


>> Think of it as an MCU with
>> FPGA fabric on chip just like the other umpty-nine peripherals they
>> already have along with.... gasp!... 5 volt tolerance.  lol
>>
>
> Is 5-V tolerance really that useful [in new designs] without the ability to actually drive 5V outputs? I suppose even you don't expect the latter in 2015 :-)

There are any number of MCUs that still have 5 volt I/Os.  A Cypress 
line that I was looking at can run with Vcc of 1.8-5 V.  Clearly there 
is a need for such parts or they wouldn't keep designing them.  The FPGA 
vendors ignore this segment because they have never wanted to go down 
the low price, high volume road in earnest.  They would love to get some 
automotive products designed in and 5 volt I/Os are popular there I 
believe.  I remember when the Xilinx folks were saying the next 
generation after Spartan 3 would not support 3.3 volt I/Os!  But then 
they also told me that if I connected the FPGA to the load with a 1 inch 
trace I could blow up the Spartan I/Os if I didn't simulate it.  Really? 
  They tend to see the world through FPGA glasses as if they drove the 
market rather than the market driving their designs.

-- 

Rick

Article: 157826
Subject: Re: Intel in Talks to buy Altera
From: HT-Lab <hans64@htminuslab.com>
Date: Sat, 04 Apr 2015 08:34:34 +0100

On 27/03/2015 19:50, 1@FPGARelated wrote:
> http://www.wsj.com/articles/intel-in-talks-to-buy-altera-1427485172
> ---------------------------------------
> Posted through http://www.FPGARelated.com
>

For those that haven't seen it:

http://www.deepchip.com/items/0548-02.html

good article,

Hans
www.ht-lab.com

Article: 157827
Subject: Re: Intel in Talks to buy Altera
From: already5chosen@yahoo.com
Date: Sun, 5 Apr 2015 04:52:44 -0700 (PDT)

On Saturday, April 4, 2015 at 10:34:43 AM UTC+3, HT-Lab wrote:
> On 27/03/2015 19:50, 1@FPGARelated wrote:
> > http://www.wsj.com/articles/intel-in-talks-to-buy-altera-1427485172
> > ---------------------------------------
> > Posted through http://www.FPGARelated.com
> >
>
> For those that haven't seen it:
>
> http://www.deepchip.com/items/0548-02.html
>
> good article,
>
> Hans
> www.ht-lab.com

I am even more pessimistic than John Cooley.
He lists "FPGA users" twice, both on the negative list (Instability in the overall FPGA market (like when two biggest players are in chaos) means R&D on the advanced FPGAs is cut; and the prices for current FPGAs go up. (It's economics.)) and on the positive list ("No more Xilinx-Altera duopoly in FPGA's!").
I personally don't see in which respects a lame duopoly can be better for me, an FPGA user, than a functional duopoly. Yes, potentially some parts could become slightly cheaper, but that's nothing relative to the impact of instability on the development process.
Besides, it seems, Cooley underestimates the ability of Intel to destroy the good, solid companies that it acquires.

Article: 157828
Subject: Does each core of 8-core Intel processor has an independent floating
From: Weng Tianxiang <wtxwtx@gmail.com>
Date: Sun, 5 Apr 2015 06:43:23 -0700 (PDT)

Hi, 
Does each core of an 8-core Intel processor have an independent x87 floating-point unit?

Here is some text from Intel's latest datasheet:

Intel(R) Core(tm) i7 Processor Family for LGA2011-v3 Socket
Datasheet - Volume 1 of 2

Processor Feature Details
*Up to 8 execution cores
*Each core supports two threads (Intel(R) Hyper-Threading Technology)
*32 KB instruction and 32 KB data first-level cache (L1) for each core
*256 KB shared instruction/data mid-level (L2) cache for each core
*Up to 15 MB last level cache (LLC): up to 2.5 MB per core instruction/data last level cache (LLC), shared among all cores.

5.2 X87 FPU INSTRUCTIONS
The x87 FPU instructions are executed by the processor's x87 FPU. These instructions operate on floating-point, integer, and binary-coded decimal (BCD) operands. For more detail on x87 FPU instructions, see Chapter 8, "Programming with the x87 FPU."

These instructions are divided into the following subgroups: data transfer, load constants, and FPU control instructions.

From the above text I have a feeling that all 8 execution cores share the same x87 FPU.

Am I right or not?

Is there anyone who has real experience with the x87 FPU?

Thank you. 

Weng

Article: 157829
Subject: Question about summation function
From: "Fess" <1@FPGARelated>
Date: Mon, 06 Apr 2015 10:30:05 -0500

Hello! I've been using VHDL for about 2 years or even more, but just today I
wondered how it works. The summation function from any package, std_arith
for example, operates on two arguments. But this one returns
std_logic_vector as a result. So I have no idea how it works when you
are using something like this:

signal result : std_logic_vector (15 downto 0);
signal arg_1: std_logic_vector (15 downto 0);
signal arg_2: std_logic_vector (15 downto 0);
signal arg_3: std_logic_vector (15 downto 0);

result  <= signed(arg_1) + signed(arg_2) + signed(arg_3);

Each summation used returns std_logic_vector, but then one more summation
is needed, and that is undefined for std_logic_vector and signed.

It would be great if someone would clarify. Thanks

P.S. sorry for my English(



---------------------------------------
Posted through http://www.FPGARelated.com

Article: 157830
Subject: Question about summation function
From: KJ <kkjennings@sbcglobal.net>
Date: Mon, 6 Apr 2015 15:08:38 -0700 (PDT)

Adding two signed will produce a signed, not std_logic_vector.

Kevin Jennings

Article: 157831
Subject: Microblaze with AXI streaming interfaces
From: "pad007" <104923@FPGARelated>
Date: Tue, 07 Apr 2015 14:11:14 -0500

Hello,
         I am trying to connect my IP to the MicroBlaze using the AXI streaming
protocol. I have connected my IP to the MicroBlaze using the AXI
streaming link M0_AXIS in XPS. But it seems that the MicroBlaze does not
accept any data, i.e., the s_axis_tready never goes high. Below is the
code for the MicroBlaze.

while(1){

      print("waiting for a packet...\n");
      getfslx(temp,0,FSL_NONBLOCKING);
      temp2=temp;
      print(temp);
      print("getfsl passed\n");
      putfslx(temp2,0,FSL_NONBLOCKING);
      print("putfsl passed\n");
      print(temp2);
}


---------------------------------------
Posted through http://www.FPGARelated.com

Article: 157832
Subject: Division by a constant
From: Rob Gaddi <rgaddi@technologyhighland.invalid>
Date: Thu, 9 Apr 2015 16:57:43 +0000 (UTC)

So I just had a thought.  Most synthesis tools (in VHDL, and I assume in 
Verilog) will allow you to use the division operator to perform 
truncating division by a constant in synthesizable code, so long as that 
constant is a power of 2.

That seems like a reasonable restriction; that you can only divide when 
it's just a right shift, right up until you think a bit longer.  Because 
I do synthesizable division by a constant all the time, actually, as 
multiplication by the reciprocal.  So I wind up writing things like

y := x * (2**17 / 3) / 2**17.

It obscures the logic a bit, but works.  But I was thinking, and not only 
does it obscure the logic, but it forces assumptions into my code about 
what the underlying multiplier block looks like.  Why 2**17?  Because I'm 
assuming an 18 bit signed multiplier, because that's what happens to be on
some architecture (Altera Cyclone4 if I remember right).
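
A minimal Python sketch of the expression above, for illustration only. It assumes unsigned x and truncating integer division when the constant is formed (so 2**17 / 3 becomes 43690); the helper name first_mismatch is hypothetical. It compares the reciprocal multiply against true floor division and reports the first input where the two differ, once with the truncated constant and once with the constant rounded up.

```python
def first_mismatch(recip, shift, divisor, limit):
    """Return the first x in [0, limit) where (x * recip) >> shift
    differs from x // divisor, or None if they agree everywhere."""
    for x in range(limit):
        if (x * recip) >> shift != x // divisor:
            return x
    return None

SHIFT = 17
TRUNCATED = 2**SHIFT // 3      # 43690, what 2**17 / 3 gives with integer division
ROUNDED_UP = TRUNCATED + 1     # 43691, the reciprocal rounded up instead of down

if __name__ == "__main__":
    # Search a little past 2**17 to see where each constant first goes wrong.
    print(first_mismatch(TRUNCATED, SHIFT, 3, 2**18))
    print(first_mismatch(ROUNDED_UP, SHIFT, 3, 2**18))
```

In this sketch the truncated constant is already off by one at x = 3, while the rounded-up constant agrees with floor division for every x below 2**17.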

It seems trivial for the synthesizer to do that transformation, division 
by a constant => multiplication by the reciprocal, in a way that is 
optimized for the underlying hardware.  Any non-braindamaged C compiler 
will do it without being asked.  And maybe the synth tools do, it's just 
been forever since I've actually checked.

Has anyone looked at this in a while?  Are any of the synth tools smart 
enough to handle this on their own these days?

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Article: 157833
Subject: Re: Division by a constant
From: Thomas Stanka <usenet_nospam_valid@stanka-web.de>
Date: Fri, 10 Apr 2015 04:26:26 -0700 (PDT)

Am Donnerstag, 9. April 2015 18:58:41 UTC+2 schrieb Rob Gaddi:
> That seems like a reasonable restriction; that you can only divide when
> it's just a right shift, right up until you think a bit longer.  Because
> I do synthesizable division by a constant all the time, actually, as
> multiplication by the reciprocal.  So I wind up writing things like
>
> y := x * (2**17 / 3) / 2**17.

There are a few nontrivial problems with division in digital logic that also apply to division by a constant.

Multiplying by the reciprocal value is a valid function for real numbers (mathematical term), but not for fixed point (unsigned with shifted decimal). If a number is 2^n, its reciprocal is well defined in fixed point. For 3 you will find no exact reciprocal in fixed-point notation, regardless of the number of digits you spend after the point.

If you use floating point your chances of a correct result are far better, but you are not safe.
The term "(a-b)*c" is not always equivalent to "a*c - b*c", even in floating-point arithmetic, when a and b differ significantly in size.

best regards

Thomas

Article: 157834
Subject: Re: Division by a constant
From: "kaz" <37480@FPGARelated>
Date: Fri, 10 Apr 2015 07:30:32 -0500

If you multiply by a fraction, e.g. y = x*0.333..., then you are just
steering the compiler away from division.

I don't use fractions, but the new libraries support them.

Kaz
---------------------------------------
Posted through http://www.FPGARelated.com

Article: 157835
Subject: Re: Division by a constant
From: Allan Herriman <allanherriman@hotmail.com>
Date: 10 Apr 2015 13:11:06 GMT

On Thu, 09 Apr 2015 16:57:43 +0000, Rob Gaddi wrote:

> So I just had a thought.  Most synthesis tools (in VHDL, and I assume in
> Verilog) will allow you to use the division operator to perform
> truncating division by a constant in synthesizable code, so long as that
> constant is a power of 2.
> 
> That seems like a reasonable restriction; that you can only divide when
> it's just a right shift, right up until you think a bit longer.  Because
> I do synthesizable division by a constant all the time, actually, as
> multiplication by the reciprocal.  So I wind up writing things like
> 
> y := x * (2**17 / 3) / 2**17.
> 
> It obscures the logic a bit, but works.  But I was thinking, and not
> only does it obscure the logic, but it forces assumptions into my code
> about what the underlying multiplier block looks like.  Why 2**17? 
> Because I'm assuming a 18 bit signed multiplier, because that's what
> happens to be on some architecture (Altera Cyclone4 if I remember
> right).
> 
> It seems trivial for the synthesizer to do that transformation, division
> by a constant => multiplication by the reciprocal, in a way that is
> optimized for the underlying hardware.  Any non-braindamaged C compiler
> will do it without being asked.  And maybe the synth tools do, it's just
> been forever since I've actually checked.
> 
> Has anyone looked at this in a while?  Are any of the synth tools smart
> enough to handle this on their own these days?

Hi Rob,

I'm still not sure I'd trust a synthesiser to handle that sort of
thing portably.

I don't think I've ever actually used a multiplier or divider in
a synthesisable design.  There always seems to be some way to avoid
them, even for DSP, usually by employing smarter design at the system
level and careful selection of scaling factors, filter coefficients,
etc.
(I don't do the sort of filtering that needs
extremely precise coefficients.  YMMV.)

When trying to avoid the use of the hard multipliers,
I would consider employing tricks like Booth recoding:
http://en.wikipedia.org/wiki/Booth%27s_multiplication_algorithm#How_it_works
which can sometimes help with a fixed multiplicand that has
a long string of 1s.

I would also look for repeating patterns in a fixed multiplicand.
Repeating patterns often arise when taking the reciprocal of a constant,
e.g. 1/5 = (binary) 0.00110011001100110011001100 etc.
This is equal to 11 * 10001 * 100000001 * ... (shifted right)
and a small number of adders can produce this result to any
desired precision.
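
The repeating-pattern idea can be sketched in Python as below. This is an illustration, not the output of any synthesis tool: the function name div5_shift_add is hypothetical, the 32-bit width is an arbitrary choice, and the +1 bias on the input is an added assumption (it is not part of the description above) that lets the truncated constant 0x33333333 reproduce floor division for unsigned 32-bit inputs.

```python
def div5_shift_add(x):
    """Compute x // 5 with shifts and adds only (unsigned x < 2**32)."""
    t = x + 1            # bias: compensates for the truncated reciprocal
    t = t + (t << 1)     # * 0b11                 (3)
    t = t + (t << 4)     # * 0b10001              (17)
    t = t + (t << 8)     # * 0b100000001          (257)
    t = t + (t << 16)    # * 0b10000000000000001  (65537)
    return t >> 32       # 3 * 17 * 257 * 65537 == 0x33333333

if __name__ == "__main__":
    for x in (0, 4, 5, 9, 10, 12345, 2**32 - 1):
        print(x, div5_shift_add(x), x // 5)
```

Four adders (plus the bias) cover 32 bits of the pattern here; each further doubling of the precision would cost one more adder stage.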

Both techniques ought to be amenable to automation in a
synthesiser.  But the synth tool I use doesn't even
support VHDL 2008 yet (thanks Xilinx!) so I won't hold
my breath waiting for comprehensive tool support for
multiplication other than the basic/obvious use of the
built-in hard blocks.

Regards,
Allan

Article: 157836
Subject: Aurora IP 8B10B problem with TVALID
From: "ponnagantiraju" <104867@FPGARelated>
Date: Fri, 10 Apr 2015 10:17:16 -0500

I am seeing the SOF (AXI_OP_TVALID) signal drop unexpectedly, as shown in
the figure (see the figure linked below). I took the example design as a
reference. In the design, I fixed the frame size (X"07"). But on the Rx
side, SOF drops near EOF (as in the figure). I want to receive the complete
frame data beat. In the waveform, SOF means TVALID and EOF means TLAST. I am
trying to get the complete data from SOF to EOF, as on the Tx side. Can
anybody suggest a solution to the problem?
Details: Aurora IP 8B10B, v8.3, Xilinx ISE 14.7

For the image:

http://forums.xilinx.com/t5/New-Users-Forum/Aurora-8b10b-problem-with-M-AXI-RX-TVALID-SOF/td-p/588858
 
Regards,
Raju


---------------------------------------
Posted through http://www.FPGARelated.com

Article: 157837
Subject: Re: Division by a constant
From: Rob Gaddi <rgaddi@technologyhighland.invalid>
Date: Fri, 10 Apr 2015 17:22:38 +0000 (UTC)

On Fri, 10 Apr 2015 04:26:26 -0700, Thomas Stanka wrote:

> Multiplying by the reciprocal value is a valid function for real
> numbers (mathematical term), but not for fixed point (unsigned with
> shifted decimal). If a number is 2^n, its reciprocal is well defined in
> fixed point. For 3 you will find no exact reciprocal in fixed-point
> notation, regardless of the number of digits you spend after the point.
> 

Yes and no.  I believe one can prove (for values of one excluding me) 
that for a bounded integer numerator, you can always define a reciprocal 
multiply that will give the exact same result as "floor division" for all 
numerators in those bounds.  Those differences you're talking about due 
to integers being "unreal" numbers are all pushed down into the 
remainder.  And the same should therefore hold for fixed_point, since 
you're always looking for the quotient to be in some finite format past 
which you don't care about the errors.

I did a quick Python script just to test with integers.  It tests the 
dumb way, through complete exhaustion of the input set, but my arbitrary 
poking about sure implies that a) you can always find an answer and b) 
that answer will require no more than 1 bit more than the numerator.

#!/usr/bin/env python3

"""
Test the theory that, given a bounded numerator, there is a reciprocal
multiply that will always give the same result as floor division.
"""

import numpy as np

nbits = 22
divide_by = 65537

# Proof through exhaustion, create all possible numerators
# int64 so that numerators * recip does not overflow for these sizes
numerators = np.arange(2**nbits, dtype=np.int64)
quotients = numerators // divide_by

class FoundAnswer(Exception): pass

try:
  expected_dbits = nbits + divide_by.bit_length()
  for dbits in range(expected_dbits-1, expected_dbits+2):
    basic_recip = (2**dbits) // divide_by
    for recip in range(basic_recip, basic_recip + 2):
      approx_quotients = (numerators * recip) >> dbits
      if np.all(approx_quotients == quotients):
        print(
          'For all {nbits} bit numerators N//{div} == N*{recip}>>{dbits}'.format(
            nbits = nbits, div = divide_by, recip=recip, dbits=dbits
        ))
        print('{recip} requires {b} bits.'.format(
          recip=recip, b=recip.bit_length()))
        raise FoundAnswer()
except FoundAnswer:
  pass

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Article: 157838
Subject: does anybody use systemc in FPGA flow?
From: "pini_kr" <93490@FPGARelated>
Date: Sat, 11 Apr 2015 01:16:37 -0500

Hi

I just wanted to know if people use systemc in FPGA flow. systemc can be
used for cycle accurate simulation, where it can replace RTL. In this
mode test-benches will usually  take advantage of c++ and SCV (for
writing constraints).
For big designs where RTL completion takes a lot of time systemc can be
used for LT or AT simulations ( Loosely Timed, Approximately Timed 
TLM).

Pini
---------------------------------------
Posted through http://www.FPGARelated.com

Article: 157839
Subject: Re: does anybody use systemc in FPGA flow?
From: Tim Wescott <seemywebsite@myfooter.really>
Date: Sun, 12 Apr 2015 01:33:46 -0500

On Sat, 11 Apr 2015 01:16:37 -0500, pini_kr wrote:

> Hi
> 
> I just wanted to know if people use systemc in FPGA flow. systemc can be
> used for cycle accurate simulation, where it can replace RTL. In this
> mode test-benches will usually  take advantage of c++ and SCV (for
> writing constraints).
> For big designs where RTL completion takes a lot of time systemc can be
> used for LT or AT simulations ( Loosely Timed, Approximately Timed TLM).

Are you asking a question or pushing an advertisement?  I doubt there are 
many on the group who don't know what SystemC is.

Proper capitalization may help -- "systemc" looks like a misspelling of 
"systemic".  SystemC looks like -- well, SystemC.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Article: 157840
Subject: Re: does anybody use systemc in FPGA flow?
From: Alan Fitch <apf@invalid.invalid>
Date: Sun, 12 Apr 2015 22:32:31 +0100

On 12/04/15 07:33, Tim Wescott wrote:
> On Sat, 11 Apr 2015 01:16:37 -0500, pini_kr wrote:
> 
>> Hi
>>
>> I just wanted to know if people use systemc in FPGA flow. systemc can be
>> used for cycle accurate simulation, where it can replace RTL. In this
>> mode test-benches will usually  take advantage of c++ and SCV (for
>> writing constraints).
>> For big designs where RTL completion takes a lot of time systemc can be
>> used for LT or AT simulations ( Loosely Timed, Approximately Timed TLM).
> 
> Are you asking a question or pushing an advertisement?  I doubt there are 
> many on the group who don't know what SystemC is.
> 
> Proper capitalization may help -- "systemc" looks like a misspelling of 
> "systemic".  SystemC looks like -- well, SystemC.
> 

The OP might be interested in this:

http://www.testandverification.com/conferences/verification-futures/2015-europe/speaker-andy-lunness-bluwireless-technology/

It's a verification paper, but the design flow was also in SystemC,

regards
Alan

-- 
Alan Fitch

Article: 157841
Subject: Re: does anybody use systemc in FPGA flow?
From: Sharad <sharad.snh@gmail.com>
Date: Mon, 13 Apr 2015 06:32:39 -0700 (PDT)

On Saturday, April 11, 2015 at 2:16:41 PM UTC+8, pini_kr wrote:
> Hi
>
> I just wanted to know if people use systemc in FPGA flow. systemc can be
> used for cycle accurate simulation, where it can replace RTL. In this
> mode test-benches will usually  take advantage of c++ and SCV (for
> writing constraints).
> For big designs where RTL completion takes a lot of time systemc can be
> used for LT or AT simulations ( Loosely Timed, Approximately Timed
> TLM).
>
> Pini
> ---------------------------------------
> Posted through http://www.FPGARelated.com

I am wondering what SystemC has got to do with the FPGA design flow. If you are asking about support for the synthesizable subset of SystemC, Xilinx and Altera do not support it in their FPGA flows (Quartus, ISE and Vivado). But it is supported in Vivado HLS. If you use HLS for FPGA designs, then yes, you can use SystemC directly. Of course, even outside of HLS, one can use SystemC... it depends on how many different models of our design we want to make.

Article: 157842
Subject: Re: Division by a constant
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Thu, 16 Apr 2015 23:01:16 +0000 (UTC)

Rob Gaddi <rgaddi@technologyhighland.invalid> wrote:
> On Fri, 10 Apr 2015 04:26:26 -0700, Thomas Stanka wrote:

>> Multiplying by the reciprocal value is a valid function for real
>> numbers (mathematical term), but not for fixed point (unsigned with
>> shifted decimal).

(snip)
> Yes and no.  I believe one can prove (for values of one excluding me) 
> that for a bounded integer numerator, you can always define a reciprocal 
> multiply that will give the exact same result as "floor division" for all 
> numerators in those bounds.  Those differences you're talking about due 
> to integers being "unreal" numbers are all pushed down into the 
> remainder.  And the same should therefore hold for fixed_point, since 
> you're always looking for the quotient to be in some finite format past 
> which you don't care about the errors.

With the multiply instruction on many processors, that generates
a double length signed product, I believe that for many constants,
maybe half of them, there is a multiplier that will generate the
appropriate truncated quotient in the high half of the product.
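
As a hedged illustration of the high-half idea, the Python sketch below models a 32x32 multiply producing a 64-bit product and keeps only the upper 32 bits. The constant 0x55555556 (2**32 / 3 rounded up) is chosen here for division by 3 and is an assumption of this example, as is the helper name mulhi32; the result matches floor division only for non-negative inputs below 2**31.

```python
import random

MAGIC_DIV3 = 0x55555556   # ceil(2**32 / 3); example constant for dividing by 3

def mulhi32(a, b):
    """High 32 bits of the 64-bit product of two non-negative 32-bit values."""
    return (a * b) >> 32

def div3_high_half(n):
    """n // 3 for 0 <= n < 2**31, read from the high half of the product."""
    return mulhi32(n, MAGIC_DIV3)

if __name__ == "__main__":
    samples = [0, 1, 2, 3, 2**31 - 1]
    samples += [random.randrange(2**31) for _ in range(5)]
    for n in samples:
        print(n, div3_high_half(n), n // 3)
```

At n = 2**31 the high half is one too large, which is why the range restriction matters in this sketch.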

But in the case the OP asked, it isn't so obvious that it
should do that. 

Another choice would be a primitive that would generate the
appropriate multiplier.

Often when you want a divider, you want it pipelined, which
is unlikely to be synthesized.

-- glen

Article: 157843
Subject: Re: Division by a constant
From: Rob Gaddi <rgaddi@technologyhighland.invalid>
Date: Thu, 16 Apr 2015 23:23:57 +0000 (UTC)

On Thu, 16 Apr 2015 23:01:16 +0000, glen herrmannsfeldt wrote:

> Rob Gaddi <rgaddi@technologyhighland.invalid> wrote:
>> On Fri, 10 Apr 2015 04:26:26 -0700, Thomas Stanka wrote:
>  
>>> Multiplying by the reciprocal value is a valid function for real
>>> numbers (mathematical term), but not for fixed point (unsigned with
>>> shifted decimal).
> 
> (snip)
>> Yes and no.  I believe one can prove (for values of one excluding me)
>> that for a bounded integer numerator, you can always define a
>> reciprocal multiply that will give the exact same result as "floor
>> division" for all numerators in those bounds.  Those differences you're
>> talking about due to integers being "unreal" numbers are all pushed
>> down into the remainder.  And the same should therefore hold for
>> fixed_point, since you're always looking for the quotient to be in some
>> finite format past which you don't care about the errors.
> 
> With the multiply instruction on many processors, that generates a
> double length signed product, I believe that for many constants,
> maybe half of them, there is a multiplier that will generate the
> appropriate truncated quotient in the high half of the product.
> 

Right, and I've never seen a multiplier block in an FPGA that doesn't do 
the same.  For a while they were 18x18=>36, then I started hitting 
18*25=>43, but regardless, the hard multiplier blocks always generate 
P'length = A'length + B'length.
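
A quick sanity check of that width rule, as a Python sketch. It assumes
unsigned operands for simplicity (the hard blocks are usually signed), and
product_bits is a hypothetical helper name.

```python
def product_bits(a_bits, b_bits):
    """Bits needed to hold the largest product of two unsigned operands."""
    max_product = ((1 << a_bits) - 1) * ((1 << b_bits) - 1)
    return max_product.bit_length()

# The two multiplier shapes mentioned above.
widths = {(a, b): product_bits(a, b) for (a, b) in [(18, 18), (18, 25)]}
```

The full product never needs more than a_bits + b_bits bits, which is why
nothing is lost when the quotient is read from the top of the product.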

> But in the case the OP asked, it isn't so obvious that it should do
> that.
> 
> Another choice would be a primitive that would generate the appropriate
> multiplier.

Ugh, but then you'd have to instantiate it as a separate block and wire 
it in.  That's even uglier than having to put the bit-shifting logic in 
manually.

> 
> Often when you want a divider, you want it pipelined, which is unlikely
> to be synthesized.
> 

Not really.  Often when I want a divider I have to pipeline it because 
the division algorithm is inherently serial.  But I don't _want_ it to be 
pipelined, that's just the only choice I've got for a true X/Y divide.
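
To show what "inherently serial" means here, a sketch of plain unsigned
restoring division in Python. This is one textbook algorithm picked for
illustration, not necessarily what any given divider core uses, and
restoring_divide is a made-up name.

```python
def restoring_divide(x, y, n_bits):
    """Unsigned restoring division of an n_bits-wide x by y.

    Produces one quotient bit per step, and each step needs the
    remainder left by the previous one.
    """
    if y == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    remainder = 0
    quotient = 0
    for i in range(n_bits - 1, -1, -1):
        # Shift in the next numerator bit, most significant first.
        remainder = (remainder << 1) | ((x >> i) & 1)
        # Trial subtract: keep the result only if it does not go negative.
        if remainder >= y:
            remainder -= y
            quotient |= 1 << i
    return quotient, remainder
```

In hardware each loop iteration becomes a compare/subtract stage, so the
choice is between a long combinational chain and registers between stages.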

But for X/K with constant K, it can (in every case I've seen) be 
implemented with a multiplier block, or simply by wire if K is a power of 
2.  That gets us done in a single cycle.
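
A sketch of how such a constant could be derived, in Python. It assumes an
unsigned numerator of n_bits bits and uses a common round-up choice,
m = ceil(2**s / K) with s = n_bits + ceil(log2(K)); reciprocal_constant and
verify are hypothetical names, and a real tool may well pick a narrower
constant than this.

```python
def reciprocal_constant(k, n_bits):
    """Return (m, s) with (x * m) >> s == x // k for all 0 <= x < 2**n_bits."""
    if k <= 0:
        raise ValueError("k must be a positive constant")
    if k & (k - 1) == 0:
        # Power of 2: multiplier of 1, the divide is just wiring (a shift).
        return 1, k.bit_length() - 1
    # For a non-power-of-2 k, k.bit_length() equals ceil(log2(k)).
    s = n_bits + k.bit_length()
    m = -(-(1 << s) // k)  # ceil(2**s / k) using integer arithmetic
    return m, s

def verify(k, n_bits):
    """Exhaustively compare against floor division over the whole input range."""
    m, s = reciprocal_constant(k, n_bits)
    return all((x * m) >> s == x // k for x in range(1 << n_bits))
```

With this choice m comes out one bit wider than the numerator, so a single
multiply followed by dropping the low s bits gives the floor quotient.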

Sure that multiplier block may blow up into horrible cross-multiplies 
spanning multiple blocks if I ask for a stupidly large K.  But the same 
can be said for X*K, and the tool lets me request that just fine, and if 
I ask for a stupid K I get a mess of logic that either a) only meets 
timing with a slow clock or b) requires me to make a few stages of 
register pushback available.

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Article: 157844
Subject: Re: Division by a constant
From: rickman <gnuarm@gmail.com>
Date: Thu, 16 Apr 2015 20:00:45 -0400
Links: << >> << T >> << A >>

On 4/16/2015 7:23 PM, Rob Gaddi wrote:
> On Thu, 16 Apr 2015 23:01:16 +0000, glen herrmannsfeldt wrote:
>
>> Rob Gaddi <rgaddi@technologyhighland.invalid> wrote:
>>> On Fri, 10 Apr 2015 04:26:26 -0700, Thomas Stanka wrote:
>>
>>>> Multiplying by the reciprocal value is valid for real numbers (in
>>>> the mathematical sense), but not for fixed point (unsigned with a
>>>> shifted decimal point).
>>
>> (snip)
>>> Yes and no.  I believe one can prove (for values of one excluding me)
>>> that for a bounded integer numerator, you can always define a
>>> reciprocal multiply that will give the exact same result as "floor
>>> division" for all numerators in those bounds.  Those differences you're
>>> talking about due to integers being "unreal" numbers are all pushed
>>> down into the remainder.  And the same should therefore hold for
>>> fixed_point, since you're always looking for the quotient to be in some
>>> finite format past which you don't care about the errors.
>>
>> With the multiply instruction on many processors, that generates a
>> double length signed product, I believe that for many constants,
>> maybe half of them, there is a multiplier that will generate the
>> appropriate truncated quotient in the high half of the product.
>>
>
> Right, and I've never seen a multiplier block in an FPGA that doesn't do
> the same.  For a while they were 18x18=>36, then I started hitting
> 18*25=>43, but regardless, the hard multiplier blocks always generate
> P'length = A'length + B'length.
>
>> But in the case the OP asked, it isn't so obvious that it should do
>> that.
>>
>> Another choice would be a primitive that would generate the appropriate
>> multiplier.
>
> Ugh, but then you'd have to instantiate it as a separate block and wire
> it in.  That's even uglier than having to put the bit-shifting logic in
> manually.
>
>>
>> Often when you want a divider, you want it pipelined, which is unlikely
>> to be synthesized.
>>
>
> Not really.  Often when I want a divider I have to pipeline it because
> the division algorithm is inherently serial.  But I don't _want_ it to be
> pipelined, that's just the only choice I've got for a true X/Y divide.

Maybe I am missing something, but it doesn't need to be pipelined.  Just 
take out the registers and it's no longer pipelined.


> But for X/K with constant K, it can (in every case I've seen) be
> implemented with a multiplier block, or simply by wire if K is a power of
> 2.  That gets us done in a single cycle.
>
> Sure that multiplier block may blow up into horrible cross-multiplies
> spanning multiple blocks if I ask for a stupidly large K.  But the same
> can be said for X*K, and the tool lets me request that just fine, and if
> I ask for a stupid K I get a mess of logic that either a) only meets
> timing with a slow clock or b) requires me to make a few stages of
> register pushback available.

I don't think a large K will create any problems that require more 
multiplier blocks.  The resolution required only depends on... the 
resolution required.  If you are working with truncated integers it 
doesn't matter if the divisor is large.  That just means you get smaller 
results, not more math to do.  Or are you thinking you can save logic by 
using small values of K which can reduce the logic required?  If using a 
block multiplier you can't use less than one... or can you?  Hmmm...

-- 

Rick

Article: 157845
Subject: Re: Division by a constant
From: Rob Gaddi <rgaddi@technologyhighland.invalid>
Date: Fri, 17 Apr 2015 00:27:52 +0000 (UTC)
Links: << >> << T >> << A >>

On Thu, 16 Apr 2015 20:00:45 -0400, rickman wrote:

> On 4/16/2015 7:23 PM, Rob Gaddi wrote:
> 
>> But for X/K with constant K, it can (in every case I've seen) be
>> implemented with a multiplier block, or simply by wire if K is a power
>> of 2.  That gets us done in a single cycle.
>>
>> Sure that multiplier block may blow up into horrible cross-multiplies
>> spanning multiple blocks if I ask for a stupidly large K.  But the same
>> can be said for X*K, and the tool lets me request that just fine, and
>> if I ask for a stupid K I get a mess of logic that either a) only meets
>> timing with a slow clock or b) requires me to make a few stages of
>> register pushback available.
> 
> I don't think a large K will create any problems that require more
> multiplier blocks.  The resolution required only depends on... the
> resolution required.  If you are working with truncated integers it
> doesn't matter if the divisor is large.  That just means you get smaller
> results, not more math to do.  Or are you thinking you can save logic by
> using small values of K which can reduce the logic required?  If using a
> block multiplier you can't use less than one... or can you?  Hmmm...

There are probably some trivial cases where the divide by N reduces to a 
multiply by something silly like 3 followed by a bit-shift that might get 
implemented on fabric, but I'd tend to assume that any time I did a 
divide, like anytime I did a multiply, I'm likely to commit a multiplier 
to it.  If I get lucky, all the better.

If it takes a lot of bits to accurately represent K, it takes a lot of 
bits to accurately represent 1/K, subject to many of the same caveats 
about factoring out powers of 2.
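
Factoring out the powers of 2 can be sketched like this in Python. The
assumptions are an unsigned n_bits-wide numerator and a round-up reciprocal
for the odd part; split_power_of_two and divide_by_constant are names
invented for the example.

```python
def split_power_of_two(k):
    """Write k as odd * 2**t and return (odd, t)."""
    t = (k & -k).bit_length() - 1  # number of trailing zero bits
    return k >> t, t

def divide_by_constant(x, k, n_bits):
    """x // k for 0 <= x < 2**n_bits, using a shift and one multiply."""
    odd, t = split_power_of_two(k)
    x_shifted = x >> t  # the power-of-2 part costs only wiring
    if odd == 1:
        return x_shifted
    # Only n_bits - t numerator bits are left after the shift.
    s = max(n_bits - t, 0) + odd.bit_length()
    m = -(-(1 << s) // odd)  # ceil(2**s / odd)
    return (x_shifted * m) >> s
```

For example K = 48 splits into 3 * 2**4, so the multiplier only has to
handle the divide by 3 on a numerator that is 4 bits narrower.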

Likewise, if I tell the tools that I want to use a 32-bit numerator, 
that'll take cross-multiplies too.  But all that already gets handled 
correctly in the multiplication case.

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.

Article: 157846
Subject: Re: Division by a constant
From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Date: Fri, 17 Apr 2015 00:43:29 +0000 (UTC)
Links: << >> << T >> << A >>

Rob Gaddi <rgaddi@technologyhighland.invalid> wrote:

(snip, I wrote)
>> With the multiply instruction on many processors, that generates a
>> double length signed product, I believe that for many constants,
>> maybe half of them, there is a multiplier that will generate the
>> appropriate truncated quotient in the high half of the product.
 
> Right, and I've never seen a multiplier block in an FPGA that 
> doesn't do the same.  For a while they were 18x18=>36, then 
> I started hitting 18*25=>43, but regardless, the hard 
> multiplier blocks always generate 
> P'length = A'length + B'length.
 
>> But in the case the OP asked, it isn't so obvious that it should do
>> that.
 
>> Another choice would be a primitive that would generate the appropriate
>> multiplier.
 
> Ugh, but then you'd have to instantiate it as a separate block and wire 
> it in.  That's even uglier than having to put the bit-shifting logic in 
> manually.

Yes it is ugh, but you know that you are asking for one. 
 
>> Often when you want a divider, you want it pipelined, 
>> which is unlikely to be synthesized.
 
> Not really.  Often when I want a divider I have to pipeline it because 
> the division algorithm is inherently serial.  But I don't _want_ it to be 
> pipelined, that's just the only choice I've got for a true X/Y divide.

Well, I usually go the FPGA route when I want something done fast,
which means pipelined. Maybe not everyone does that.
 
> But for X/K with constant K, it can (in every case I've seen) be 
> implemented with a multiplier block, or simply by wire if K is 
> a power of 2.  That gets us done in a single cycle.
 
> Sure that multiplier block may blow up into horrible cross-multiplies 
> spanning multiple blocks if I ask for a stupidly large K.  But the same 
> can be said for X*K, and the tool lets me request that just fine, and if 
> I ask for a stupid K I get a mess of logic that either a) only meets 
> timing with a slow clock or b) requires me to make a few stages of 
> register pushback available.

For software, you usually have (N)*(N)=(2N) and (2N)/(N)=(N)

In the FPGA case, even though the hardware is 18 bits, you can
choose any number of bits for your actual values. 

I am pretty sure that if you have one more bit in the constant
than you need in the quotient, that it is enough. I am not sure
that it always is when you have the same number of bits.

That is, I am not sure that you can generate a 32 bit signed 
quotient as the high half of a 32 bit multiply for all possible 32
bit signed divisors. 
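
The same question can be explored by brute force at small widths. The
Python sketch below assumes unsigned values (not the signed case asked
about above) and a single multiply followed by a shift; find_multiplier is
a hypothetical helper, and the search is only practical for small n_bits.

```python
def find_multiplier(d, n_bits, const_bits):
    """Search for (m, s) with m < 2**const_bits such that
    (x * m) >> s == x // d for every unsigned n_bits-wide x.

    Returns the first match found, or None if no such pair exists.
    """
    xs = range(1 << n_bits)
    # m must be at least 2**s / d, so larger shifts cannot fit the constant.
    for s in range(n_bits + const_bits + 1):
        for m in range(1, 1 << const_bits):
            if all((x * m) >> s == x // d for x in xs):
                return m, s
    return None

same_width = find_multiplier(7, 8, 8)  # constant as wide as the numerator
one_extra = find_multiplier(7, 8, 9)   # one more bit in the constant
```

For 8-bit numerators and d = 7 the search finds no 8-bit constant but does
find a 9-bit one, which matches the "one more bit is enough, the same
number of bits is not always enough" intuition for the unsigned case.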

-- glen

Article: 157847
Subject: Re: Division by a constant
From: Nikolaos Kavvadias <nikolaos.kavvadias@gmail.com>
Date: Fri, 17 Apr 2015 06:35:37 -0700 (PDT)
Links: << >> << T >> << A >>

Hi Rob and Glen

Have you used kdiv, my constant division routine generator?

It produces low-level ("assembly") and C implementations for constant division.

http://github.com/nkkav/kdiv

Many people are happy with it; it is based on Warren's Hacker's Delight.

Best regards,
Nikolaos Kavvadias
http://www.nkavvadias.com



> 
> (snip, I wrote)
> >> With the multiply instruction on many processors, that generates a
> >> double length signed product, I believe that for many constants,
> >> maybe half of them, there is a multiplier that will generate the
> >> appropriate truncated quotient in the high half of the product.
>  
> > Right, and I've never seen a multiplier block in an FPGA that 
> > doesn't do the same.  For a while they were 18x18=>36, then 
> > I started hitting 18*25=>43, but regardless, the hard 
> > multiplier blocks always generate 
> > P'length = A'length + B'length.
>  
> >> But in the case the OP asked, it isn't so obvious that it should do
> >> that.
>  
> >> Another choice would be a primitive that would generate the appropriate
> >> multiplier.
>  
> > Ugh, but then you'd have to instantiate it as a separate block and wire 
> > it in.  That's even uglier than having to put the bit-shifting logic in 
> > manually.
> 
> Yes it is ugh, but you know that you are asking for one. 
>  
> >> Often when you want a divider, you want it pipelined, 
> >> which is unlikely to be synthesized.
>  
> > Not really.  Often when I want a divider I have to pipeline it because 
> > the division algorithm is inherently serial.  But I don't _want_ it to be 
> > pipelined, that's just the only choice I've got for a true X/Y divide.
> 
> Well, I usually go the FPGA route when I want something done fast,
> which means pipelined. Maybe not everyone does that.
>  
> > But for X/K with constant K, it can (in every case I've seen) be 
> > implemented with a multiplier block, or simply by wire if K is 
> > a power of 2.  That gets us done in a single cycle.
>  
> > Sure that multiplier block may blow up into horrible cross-multiplies 
> > spanning multiple blocks if I ask for a stupidly large K.  But the same 
> > can be said for X*K, and the tool lets me request that just fine, and if 
> > I ask for a stupid K I get a mess of logic that either a) only meets 
> > timing with a slow clock or b) requires me to make a few stages of 
> > register pushback available.
> 
> For software, you usually have (N)*(N)=(2N) and (2N)/(N)=(N)
> 
> In the FPGA case, even though the hardware is 18 bits, you can
> choose any number of bits for your actual values. 
> 
> I am pretty sure that if you have one more bit in the constant
> than you need in the quotient, that it is enough. I am not sure
> that it always is when you have the same number of bits.
> 
> That is, I am not sure that you can generate a 32 bit signed 
> quotient as the high half of a 32 bit multiply for all possible 32
> bit signed divisors. 
> 
> -- glen

Article: 157848
Subject: Choosing the right FPGA board
From: "FrewCen" <105208@FPGARelated>
Date: Mon, 20 Apr 2015 07:01:40 -0500
Links: << >> << T >> << A >>

Hello!

I have several years of experience in programming, and I'd like to move 
on to FPGAs to enjoy more fun.

As I have a limited budget for my playing with electronics, I'd like to 
choose the most versatile board for the best price with a decent support 
from manufacturer. I'm a student, so I guess the academic prices apply 
for me.

I tried to do my own research on google. What I wanted to have on my 
board was:
 - VGA/HDMI port
 - SD card slot
 - some memory
 - PS/2 keyboard
 - USB and Ethernet, although I have almost no idea about how these two 
work


I found these boards:

 - Basys™2 - Xilinx Spartan-3E, 8-bit VGA, PS/2 - 69$
   http://www.digilentinc.com/Products/Detail.cfm?Nav...
 - Basys™3 - Xilinx Artix-7, 12-bit VGA, USB host for kb/mice, flash - 79$
   http://www.digilentinc.com/Products/Detail.cfm?Nav...
 - miniSpartan6+ - Spartan 6 LX 9, HDMI, serial flash, microSD - 75$
   http://www.scarabhardware.com/product/minisp6/
 - ZYBO Zynq™-7000 - Xilinx Z-7010, Cortex-A9, flash, memory, SD, USB,
   gigabit Ethernet, HDMI, 16-bit VGA - 125$
   http://www.digilentinc.com/Products/Detail.cfm?Nav...
 - Altera DE0 Board - Altera Cyclone III 3C16, 4-bit VGA, SD, serial port,
   PS/2, flash - 81$
   http://www.terasic.com.tw/cgi-bin/page/archive.pl?...
 - Altera DE0-CV Board - Altera Cyclone V 5CEBA4F23C7N, 4-bit VGA, microSD,
   PS/2 - 99$
 - Altera DE1 Board - Altera Cyclone II 2C20, 4-bit R-2R per channel VGA,
   PS/2, SD, flash - 127$

Here's where I can't decide. Again, cost is important to me, but I also 
know that Digilent and Terasic are Some Names.

What would you choose? Do you have any of your own recommendations?
Please help, I'm honestly an absolute nooob here.


---------------------------------------
Posted through http://www.FPGARelated.com

Article: 157849
Subject: Re: Choosing the right FPGA board
From: jim.brakefield@ieee.org
Date: Mon, 20 Apr 2015 07:49:38 -0700 (PDT)
Links: << >> << T >> << A >>

On Monday, April 20, 2015 at 7:01:43 AM UTC-5, FrewCen wrote:
> Hello!
> 
> I have several years of experience in programming, and I'd like to move 
> on to FPGAs to enjoy more fun.
> 
> As I have a limited budget for my playing with electronics, I'd like to 
> choose the most versatile board for the best price with a decent support 
> from manufacturer. I'm a student, so I guess the academic prices apply 
> for me.
> 
> I tried to do my own research on google. What I wanted to have on my 
> board was:
>  - VGA/HDMI port
>  - SD card slot
>  - some memory
>  - PS/2 keyboard
>  - USB and Ethernet, although I have almost no idea about how these two 
> work
> 
> 
> I found these boards:
> 
>  - Basys(tm)2 - Xilinx Spartan-3E, 8-bit VGA, PS/2 - 69$
>    http://www.digilentinc.com/Products/Detail.cfm?Nav...
>  - Basys(tm)3 - Xilinx Artix-7, 12-bit VGA, USB host for kb/mice, flash - 79$
>    http://www.digilentinc.com/Products/Detail.cfm?Nav...
>  - miniSpartan6+ - Spartan 6 LX 9, HDMI, serial flash, microSD - 75$
>    http://www.scarabhardware.com/product/minisp6/
>  - ZYBO Zynq(tm)-7000 - Xilinx Z-7010, Cortex-A9, flash, memory, SD, USB,
>    gigabit Ethernet, HDMI, 16-bit VGA - 125$
>    http://www.digilentinc.com/Products/Detail.cfm?Nav...
>  - Altera DE0 Board - Altera Cyclone III 3C16, 4-bit VGA, SD, serial port,
>    PS/2, flash - 81$
>    http://www.terasic.com.tw/cgi-bin/page/archive.pl?...
>  - Altera DE0-CV Board - Altera Cyclone V 5CEBA4F23C7N, 4-bit VGA, microSD,
>    PS/2 - 99$
>  - Altera DE1 Board - Altera Cyclone II 2C20, 4-bit R-2R per channel VGA,
>    PS/2, SD, flash - 127$
> 
> here's where I can't decide. Again, cost is important for me, but I also 
> know that Digilent and Terasic are Some Names.
> 
> What would you choose? Do you have any of your own recommendations?
> Please help, I'm honestly an absolute nooob here.
> 
> 
> ---------------------------------------
> Posted through http://www.FPGARelated.com

I'd stick with the newer FPGAs, to make your learning as relevant as possible.

Another approach is to pay to go to a seminar where you get to keep the FPGA board.  Cyclone V SOC or SmartFusion2 for $99 each.
